ABSTRACT


The  goal  of  the  present  research  was  to  articulate,  develop  and  assess  an  object-based 
attention  framework  for  researching  and  designing  HUD  symbology.  The  theoretical, 
empirical  and  neuropsychological  foundations  of  an  object-based  attention  framework  are 
summarized  and  linked  to  research  on  HUD  symbology.  It  is  shown  that  an  object-based 
attention  framework  can  be  used  to  address  and  examine  a  variety  of  core  issues 
concerning the use of HUDs in aircraft. A series of five experiments is presented in
which a framework is established for examining object-based attention effects in visual
displays.

OBJECT-BASED ATTENTION AND THE DEVELOPMENT OF
HEADS-UP DISPLAYS

A variety of helmet-mounted displays (HMDs) have been proposed to aid pilots when
flying in degraded visual conditions. These include light-intensifying night vision
goggles (NVGs), forward-looking infrared (FLIR) systems, and enhanced synthetic visual
systems  (ESVS).  In  all  of  these  devices,  there  is  a  requirement  to  include  heads-up 
displays  (HUDs)  where  symbology  concerning  aircraft  orientation,  system  status,  and 
energy  state  is  projected  onto  the  HMD.  It  is  not  at  all  clear,  however,  how  HUD 
symbologies  should  be  represented  on  HMDs  nor  how  these  symbologies  should  be 
configured,  grouped,  and  referenced.  One  reason  for  this  is  that  HUD  symbology  has 
often  been  researched  and  developed  in  a  theoretical  vacuum  and  without  due 
consideration  of  human  perceptual/cognitive  abilities  and  limitations. 

The  goal  of  the  present  research  was  to  articulate,  develop  and  assess  an  object-based 
attention framework for researching and designing HUD symbology. Section One of this
report is devoted to the theoretical and empirical foundations of the object-based attention
framework. In Section Two, a series of experiments is presented in which a framework
is established for examining object-based attention effects in visual displays.

SECTION ONE: THEORETICAL AND EMPIRICAL
FOUNDATIONS OF AN OBJECT-BASED ATTENTION
FRAMEWORK

1.  ATTENTION  AND  OBJECT  PERCEPTION  IN  HEADS-UP  DISPLAYS 
1.1  Head-Up  Displays  (HUDs) 

Traditionally, aircraft have been equipped solely with head-down displays (HDDs)
where the aircraft’s cockpit instrumentation (the “near domain”) is located about 10°
below the pilot’s forward field of view (the “far domain”). This configuration precludes
the possibility of pilots simultaneously foveating the aircraft’s instrumentation and the
outside scene. There are many missions, however, when pilots need to maximize time
looking  outside  the  cockpit  (“eyes  out”  time)  while  also  closely  monitoring  the  aircraft’s 
instrumentation  for  flight  and  power  information.  For  example,  during  low-level  flight 
pilots  need  to  concentrate  on  the  external  scene  while  frequently  cross-checking  the 
cockpit  instrumentation  for  altitude,  airspeed,  as  well  as  power,  navigational,  and 
communication  information.  Switching  between  the  external  scene  and  the  cockpit 
instrumentation  is  time  consuming  and  effortful  and  is  likely  to  diminish  the  pilot’s 
situational  awareness:  The  information  processing  stream  is  interrupted  during  ocular 
saccades  and  often  during  rapid  head  movements,  thereby  inhibiting  pilots’  ability  to 
build  up  a  stable  and  accurate  mental  representation  of  the  flight  environment. 

The  problem  of  switching  viewpoint  between  the  outside  scene  and  an  HDD  is 
often  exacerbated  when  pilots  are  equipped  with  HMDs.  For  example,  with  binocular 
NVGs,  pilots  must  look  under  the  goggles  in  order  to  view  the  cockpit  instrumentation. 
Re-accommodation  of  the  eye  is  required  when  the  pilot  switches  fixation  between  the 
NVG  far  domain  and  the  near  domain  of  the  HDD.  For  fully  immersive  HMDs,  such  as 
those  proposed  for  ESVS,  pilots  do  not  have  visual  access  to  the  cockpit  instrumentation. 

HUDs of instrumentation symbology have been developed as an alternative to the
traditional HDD. HUDs are either located in a panel that is fixed in the forward, heads-up
view of the pilot or superimposed on an HMD. A presumed advantage of HUDs is that
pilots should be able to simultaneously access the instrumentation symbology while
looking out at the external scene. Accordingly, HUDs should enhance situational
awareness by eliminating (or greatly reducing) the need for pilots to repeatedly switch
between a head-up/eyes-out viewpoint and a head-down/eyes-in viewpoint. Consistent with
this view, numerous studies have shown a benefit of HUDs over traditional head-down displays
(HDDs). For example, McCann and Foyle (1996) showed that (a) pilots are able to
control an aircraft’s flight-path better with HUDs than with standard HDDs and (b) pilots
are faster in responding to events located in either the far or the near domain when using
a HUD rather than an HDD. There is also some evidence that pilots are better at tracking
flight  guidance  symbology  such  as  airspeed  and  altitude  with  HUDs  than  with  HDDs 
(Martin-Emerson  &  Wickens,  1997). 


1.2  Attentional/Cognitive  Tunnelling 

Although HUDs may enhance performance by allowing pilots to maintain a
heads-up field of view, there are also situations where HUDs have been shown to result in
performance decrements due to attentional/cognitive tunnelling. Some of these
attention-based performance decrements have serious implications for flight safety.

Fisher,  Haines  and  Price  (1980)  described  a  simulator-based  experiment  where 
pilots  were  required  to  perform  runway  approaches  flying  an  aircraft  that  was  equipped 
with  a  HUD  versus  a  traditional  HDD.  It  was  found  that  some  of  the  pilots  using  the 
HUD  failed  to  notice  unexpected  intrusions  on  the  runway  when  they  were  also  required 
to  attend  to  events  in  the  near  domain  (see  also  McCann  &  Foyle,  1996).  This  failure  to 
notice  runway  intrusions  was  not  experienced  by  pilots  using  the  HDD.  Although 
suggestive,  the  Fisher  et  al.  study  was  flawed  in  that  the  location  of  the  instrumentation 
(HUD  vs.  HDD)  was  confounded  with  the  type  of  instrumentation. 

Wickens  and  Long  (1994)  repeated  the  Fisher  et  al.  (1980)  experiment  but  with 
matched  instrumentation  across  the  HUD  and  HDD.  In  contrast  to  the  Fisher  et  al. 
findings,  pilots  using  the  HUD  were  successful  in  noticing  runway  intrusions.  However, 
these  pilots  were  considerably  (2.5  sec.)  slower  to  respond  to  intrusions  than  were  pilots 
using  the  HDD.  In  sum,  both  the  Fisher  et  al.  (detection  accuracy)  and  the  Wickens  and 
Long  (detection  time)  studies  show  a  disadvantage  for  HUDs  versus  HDDs.  This 
disadvantage  seems  to  arise  in  situations  where  the  pilot  has  to  simultaneously  attend  to 
information  located  on  the  HUD  and  in  the  external  scene. 

The  disadvantage  of  HUDs  is  further  illustrated  in  a  study  by  Foyle,  Stanford  and 
McCann  (1991)  who  required  pilots  to  control  their  flight-path  while  maintaining  a  fixed 
altitude.  Superimposing  a  HUD  digital  readout  of  altitude  onto  the  flight  path  resulted  in 
excellent  control  of  altitude.  However,  when  focusing  on  altitude,  pilots  tended  to  collide 
with  the  flight-path  markers,  such  as  buildings  or  landmarks.  This  trade-off  between 
using  the  HUD  symbology  (digital  altitude)  and  processing  of  the  external  scene  cannot 
be  attributed  to  visual  interference  or  masking:  The  same  HUD  symbology  was  presented 
across  the  various  conditions.  Instead,  this  evidence  suggests  that  when  the  HUD 
symbology is required for performance, the symbology tunnels the pilot’s attention at the
cost of neglecting objects and events in the environment. That is, when attention is focused
on one domain, information located in the other domain tends to go unnoticed.
This phenomenon has been labelled cognitive tunnelling (Martin-Emerson & Wickens,
1997; Wickens & Long, 1995). Cognitive tunnelling with HUDs is driven by attentional
mechanisms where a pilot’s awareness of the far domain (external scene) is reduced
when  attending  to  the  near  (HUD)  domain.  Accordingly,  insight  into  the  mechanisms 
underlying  cognitive  tunnelling  on  HUD  symbology  (and  how  to  prevent  tunnelling) 
requires  an  understanding  of  the  role  that  attention  plays  in  human  information 
processing. 

1.3  Attention  And  The  Perceptual  Grouping  Of  Symbology  Into  Objects 

A  growing  body  of  research  in  the  HUD  literature  can  be  linked  to  evidence  from 
basic  research  showing  that  attention  is  referenced  to  perceptual  groups  or  objects  within 
the  visual  field  (Duncan,  1984;  Baylis  &  Driver,  1989;  Kramer  &  Jacobson,  1991).  This 
is  known  as  the  object-based  attention  hypothesis. 

It  is  generally  accepted  that  a  perceptual  object  or  group  is  formed  according  to 
the  Gestalt  grouping  principles  of  motion,  colour,  proximity,  closure  and/or  figure- 
ground  separation  (Koffka,  1935).  According  to  this  definition,  objects  can  be  something 
as  simple  as  features  moving  together  against  a  background  of  static  features  (Baylis  & 
Driver,  1988),  or  an  object  can  be  formed  by  a  simple  pattern  that  is  determined  by 
closure,  proximity  or  colour.  A  perceptual  object  can  also  consist  of  a  more  coherent 
form  involving  more  than  one  grouping  principle. 

Research  has  shown  that  it  is  easier  to  attend  to  a  single  object  within  the  visual 
field  than  it  is  to  divide  attention  between  two  separate  objects.  Attending  to  a  single 
object  may  also  allow  for  parallel  processing  of  all  features  of  that  object,  whereas 
features  belonging  to  two  separate  objects  are  processed  serially  (Baylis  &  Driver,  1993; 
Goldsmith,  1998).  This  evidence  for  object-based  attention  has  led  researchers  to  claim 
that  the  attentional  problems  experienced  with  HUDs  are  due  to  the  near  and  far  domains 
forming  separate  perceptual  groups.  This  claim  is  predicated  on  the  notion  that  near  and 
far  domains  differ  along  one  or  more  of  the  Gestalt  grouping  principles.  In  particular,  the 
HUD symbology is stationary relative to the pilot- or aircraft-centric view, whereas the
external scene is in constant motion. Also, HUD symbology is usually displayed in a
uniform colour, which may differ from the various colours of the external scene.

The  claim  that  the  near  (HUD)  and  far  (external  scene)  domains  form  separate 
perceptual  groups  provides  a  possible  explanation  of  cognitive  tunnelling  when  combined 
with  the  object-based  hypothesis.  On  this  view,  when  pilots  attend  to  the  near  domain,  all 
of  the  HUD  symbols  get  processed  quickly  in  parallel  while  processing  of  information  in 
the  far  domain  is  delayed. 

Martin-Emerson  and  Wickens  (1997)  examined  the  difference  in  the  use  of  HUDs 
versus  HDDs  across  different  visibility  levels.  As  pilots  came  in  for  an  approach  to  land 
under  different  visibility  conditions  they  had  to  hold  a  stable  altitude  and  control  lateral 
and  vertical  tracking.  Flight  path  guidance  was  superimposed  onto  the  path  for  the  HUD 
condition  and  located  below  the  windshield  for  the  HDD  condition.  The  results  showed 
that  for  the  HUD  condition  pilots  were  faster  to  respond  to  events  within  the  HUD 
display  and  to  control  altitude  when  under  zero  visibility  as  compared  to  full  visibility 
conditions.  In  full  visibility  pilots  attend  to  the  far  domain,  thereby  making  it  more 
difficult  to  control  the  altitude  and  respond  to  events  occurring  in  the  HUD.  However, 
lateral  and  vertical  tracking  errors  also  decreased  in  the  full  visibility  condition. 
According  to  Martin-Emerson  and  Wickens  this  was  due  to  pilots  switching  attention  to 
the  far  domain  in  the  full  visibility  condition.  However,  because  pilots  had  to  hold 
altitude  and  respond  to  events  within  the  near  domain,  it  is  unlikely  that  they  switched 
attention  completely  toward  the  far  domain.  Indeed,  it  is  possible  that  the  pilots 
successfully  integrated  the  HUD  flight-path  symbology  with  the  external  environment  but 
experienced  difficulty  attending  to  other  information  located  in  the  HUD.  That  is, 
although  this  evidence  fits  with  the  object-based  hypothesis  that  attention  is  difficult  to 
divide  across  near  versus  far  domains,  it  can  also  be  interpreted  as  indicating  a  difficulty 
in  integrating  information  within  a  single  domain. 

McCann,  Foyle  &  Johnston’s  (1993)  finding  that  responses  to  targets  were 
significantly  delayed  when  a  cue  was  presented  to  the  nontarget  domain  could  be 
interpreted  as  showing  that  near  (HUD)  and  far  (external  scene)  domains  form  separate 
visual  objects.  In  accord  with  the  object-based  attention  hypothesis,  target  responses  are 
slower when the cue occurs in the nontarget domain because it takes time to switch
attention from one object (domain) to the other. However, a careful review of the
McCann,  Foyle  &  Johnston  procedure  leads  to  an  alternative  explanation.  In  this 
experiment,  pilots  were  required  to  perform  an  approach  to  land.  As  they  approached 
landing,  a  three-letter  cue  (“IFR”  or  “VFR”)  appeared  either  on  the  HUD  or  on  the 
runway  (external  scene).  The  cue  indicated  where  to  look  for  a  target  among  several 
geometric  symbols  that  appeared  on  both  the  HUD  symbology  set  and  the  runway.  “IFR” 
(for instrument flight rules) indicated that the set of symbols on the HUD was relevant.
“VFR”  (for  visual  flight  rules)  indicated  that  the  set  of  symbols  located  on  the  runway 
was  relevant.  The  pilots  were  required  to  identify  whether  one  of  the  symbols  (the  target) 
was  a  diamond  or  a  stop  sign:  A  landing  was  allowed  only  if  the  target  was  a  diamond. 
Four  boxes  were  located  on  the  HUD  to  flank  either  side  of  the  runway  and  another  four 
boxes  were  superimposed  onto  the  far  domain  in  a  similar  position.  The  distance  between 
the  boxes  was  equal  for  both  domains.  The  three  geometric  symbols  appeared  in  three  of 
the  boxes  on  each  domain  and  the  cue  would  appear  in  the  fourth  box  on  either  the  near 
or the far domain. If the cue appeared on the HUD it filled the box in either the bottom
left or the bottom right corner. If the cue appeared on the runway it filled either the top
left or the top right box.

The results showed that subjects were significantly slower in responding to the
relevant target when the target and the cue were located on different domains. For
example, when the “IFR” cue (indicating that the target on the HUD is the relevant one) appeared
in the display, pilots were faster to respond to the target when it was located on the HUD
than on the runway.

As  noted  above,  the  object-based  attention  interpretation  of  the  McCann,  Foyle  & 
Johnston  (1993)  result  is  that  the  near  and  far  domains  form  separate  visual  objects:  It 
takes  time  to  switch  attention  from  one  object  to  the  other.  However,  another  plausible 
interpretation  is  that  the  slower  responses  in  the  cross-domain  condition  are  due  to  there 
being  two  different  types  of  cues.  The  sudden  onset  of  the  three  letters  is  a  form  of 
exogenous  cueing  that  immediately  draws  attention  to  that  location.  In  contrast,  the 
interpretation  of  the  three  letters  is  a  form  of  endogenous  cueing:  the  participants  had  to 
interpret  the  meaning  of  the  three  letters  to  determine  the  relevant  target  location. 
Attentional allocation is much slower with endogenous cues than with exogenous cues:
whereas attentional allocation can occur within 100 ms with exogenous cues (Wright &
Ward,  1998),  the  allocation  of  attention  with  endogenous  cues  can  require  300  ms  and 
longer  (Stelmach,  Campsall,  &  Herdman,  1997).  On  this  view,  when  “IFR”  was  shown 
on  the  HUD,  the  pilots  would  be  able  to  determine  almost  immediately  whether  the  target 
on  the  HUD  was  a  diamond  or  stop  sign;  there  would  be  no  need  to  interpret  the  cue 
itself. Accordingly, when the symbolic (endogenous) cue concurred with the direct
(exogenous) cue, pilots were fast to respond. When the symbolic cue did not concur with
the direct cue (that is, when the location indicated by the letters differed from the
location at which the letters appeared), responses were
slow.

In  sum,  the  Martin-Emerson  and  Wickens  (1997)  and  the  McCann,  Foyle  & 
Johnston  (1993)  experiments  suggest  that  attention  limits  the  ability  of  pilots  to  process 
HUD  symbology  simultaneously  with  information  in  the  external  scene.  The  results  of 
these  experiments  have  been  explained  using  an  object-based  attention  hypothesis,  on  the 
assumption  that  the  near  and  far  domains  form  separate  perceptual  groupings.  However, 
lack  of  definition  of  the  concepts  involved  in  the  object-based  hypothesis  makes  it  hard  to 
avoid  possible  confusion  with  other  factors  that  may  influence  attention.  For  example,  the 
results  in  the  McCann  et  al.  study  can  be  interpreted  in  terms  of  spatial  cueing  rather  than 
perceptual grouping. Also, focusing entirely on near versus far domains as the relevant
“objects”  in  attentional  control  limits  the  application  of  the  object-based  attention 
hypothesis.  To  wit,  in  the  Martin-Emerson  and  Wickens  study,  performance  changes 
may  be  attributed  to  attentional  difficulties  in  integrating  information  within  the  near 
domain.  It  is  important,  therefore,  to  further  develop  the  object-based  framework  in  order 
to  systematically  study  the  information  processing  difficulties  in  HUDs. 

1.4  Conformal  Symbology 

As  mentioned  above,  most  research  on  HUD  symbology  has  proceeded  on  the 
assumption  that  the  HUD  symbology  and  the  external  scene  form  two  distinct  perceptual 
groups or domains. Accordingly, it has been suggested that fusing the HUD and external
scenes may provide a way to facilitate the dividing of attention between the domains.
One way of doing this is to use conformal symbology. Broadly speaking, the definition of
conformality used by HUD developers refers to the degree to which a symbol forms an
object within the scenery. The idea is that a conformal symbol should serve as a virtual
analog for far domain elements. In other words, symbology that is an accurate graphic
representation of an actual object in the far domain, or that forms a one-to-one
correspondence with the world, is deemed to be conformal (Martin-Emerson &
Wickens, 1997). On this view, conformal symbology can be a virtual runway overlaying
the actual runway or a scene-linked symbology where, for example, altitude is
represented at the height of, and possibly in, actual objects in the far domain.
Non-conformal symbology would be symbols such as a digital readout of the altitude or
airspeed, path guidance information like glide slope, or localizer symbology (Wickens &
Long, 1995). It should be noted that according to this definition, even traditional HUDs
have some conformal symbology (e.g., a horizon line). In contrast, symbols representing
VSI, airspeed, distance, and altitude are usually non-conformal.

Experiments  examining  conformal  symbology  in  HUDs  have  yielded  promising 
results.  For  example,  research  by  Foyle,  Stanford  and  McCann  (1991;  see  also  McCann 
&  Foyle,  1994)  has  shown  that  when  using  conformal  symbology  pilots  are  able  to 
maintain  altitude  and  follow  a  flight  path  without  significant  trade-offs  in  performance. 

In  these  studies,  altitude  symbology  was  rendered  conformal  by  placing  the  symbology 
on  virtual  buildings  along  the  flight  path.  In  contrast,  when  the  altitude  indicator  was 
superimposed onto the path (non-conformal) the task of maintaining altitude reduced
flight  path  performance. 

Varying  the  form  of  the  conformal  symbology  does  not  seem  to  diminish  the 
enhanced  performance.  McCann  and  Foyle  (1995)  and  Shelden,  Foyle  and  McCann 
(1997) have shown evidence for the same benefit of conformal symbology over
non-conformal symbology regardless of whether the form of the symbology was analog
("clockface") or digital. These results are promising and suggest that the
conformal character of the HUD symbology enables parallel processing of
information from the two domains. In accord with an object-based hypothesis, conformal
symbology  might  allow  for  the  creation  of  a  single  far  domain  (or  object  layer)  of 
information.  On  this  view,  performance  is  enhanced  because  the  pilot  is  able  to  allocate 
attention  to  the  far  domain  without  the  need  to  switch  to  the  near  domain. 

Although the Foyle, Stanford and McCann (1991), McCann and Foyle (1994;
1996) and the Shelden, Foyle and McCann (1997) research on conformal symbology has
yielded  promising  results,  the  experimental  designs  that  have  been  used  in  this  research 
are  flawed.  In  particular,  adding  elements  onto  the  flight  path,  whether  those  elements 
are  virtual  buildings  or  numbers,  increases  the  number  of  cues  that  the  pilot  can  use  to 
control  the  aircraft's  flight  path.  Accordingly,  the  enhancement  in  timesharing  the  flight- 
path  task  and  the  symbology-based  tasks  may  not  be  due  to  a  reduced  requirement  to 
switch  attention  across  domains.  Instead,  this  enhanced  dual-task  performance  may  be 
attributed  to  the  reduced  load  associated  with  controlling  the  aircraft's  flight  path  when 
more  path  cues  are  present. 

In  another  experiment,  Martin-Emerson  and  Wickens  (1997)  tested  the  difference 
between  conformal  and  non-conformal  HUDs  using  symbology  that  differed  only  in 
terms  of  path  guidance  information.  Both  conditions  included  non-conformal  symbology 
such  as  VSI,  heading,  speed,  and  distance.  In  the  conformal  condition,  a  virtual  runway 
overlaying  the  actual  runway  provided  path  guidance.  In  the  non-conformal  condition, 
path  guidance  was  represented  by  a  localizer  and  a  glide  slope,  a  fixed  aircraft  symbol 
and  a  reference  line.  The  subjects’  task  was  to  approach  to  land  under  different  visibility 
conditions.  The  results  showed  that  for  the  non-conformal  condition  there  was  large 
variance  in  lateral  tracking  errors  depending  on  visibility.  In  comparison,  when  pilots 
used  the  conformal  symbology  the  lateral  tracking  errors  were  undifferentiated  across  the 
different levels of visibility. For the vertical tracking errors, symbology had no effect.
What  makes  these  results  interesting  is  that  if  the  problem  with  non-conformal 
symbology  is  that  pilots  cannot  process  information  in  both  near  and  far  domains  in 
parallel,  then  conformal  symbology  should  have  shown  benefits  for  both  vertical  and 
lateral  tracking.  There  are,  of  course,  important  differences  between  vertical  and  lateral 
tracking  tasks.  Lateral  tracking  is  more  difficult  and  often  involves  turbulence.  It  is 
possible  that  the  benefit  for  lateral  tracking  when  pilots  used  conformal  symbology 
reflects  a  more  intuitive  method  of  control. 

In  sum,  the  use  of  conformal  symbology  is  promising  in  that  it  reduces  the 
performance  tradeoffs  found  with  HUDs.  It  is  important  to  note  that  the  object-based 
attention  hypothesis  can  be  applied  to  the  Martin-Emerson  and  Wickens  (1997)  results 
although not necessarily under the assumption that conformal symbology fuses the two
domains into a single perceptual group. For example, it could be argued that conformal
symbology  leads  to  better  performance  because  the  symbology  forms  a  coherent  object  in 
and  of  itself.  In  accord  with  the  object-based  attention  hypothesis,  a  sense  of  objectness 
presumably  makes  it  easier  for  pilots  to  attend  and  process  information  within  the 
conformal symbology. That is, the advantage of conformal symbology may not arise because
the symbology is integrated into the external scene, as is commonly
assumed, but may instead be attributable to the facilitatory effects of object-based attention.

1.5  Visual  Clutter 

The object-based attention framework applies not only to the issue of
simultaneously attending to the near and far domains; it is also implicated in the
processing of symbology within a particular domain. Individual elements in cluttered
displays are more difficult to locate, attend to, and interpret. As a result, attending to one
particular element is often accomplished at the cost of interference from other elements
(Martin-Emerson & Wickens, 1997). It might be the case that making a particular
element  more  object-like  makes  it  easier  to  attend  to  that  element  and  filters  out 
interference  from  neighbouring  elements.  Also,  the  sheer  number  of  symbols  in  many 
symbology  sets  can  make  it  difficult  to  integrate  relevant  information  coming  from 
various  elements,  or  to  organize  the  elements  so  as  to  optimise  the  visual  interrogation  of 
these  elements.  Principles  of  perceptual  grouping  might  facilitate  the  organization  of  a 
cluttered  display  into  usefully  related  sub-groups. 

Integrating  conformal  symbology  into  the  external  scene,  which  has  been 
suggested  as  a  potential  solution  to  the  tunnelling  problem,  can  reduce  the  level  of  clutter 
in  the  near  domain,  but  in  turn  will  increase  the  number  of  elements  in  the  external  scene. 
The  full  impact  of  transferring  elements  from  the  near  to  the  far  domain  is  unknown.  As 
noted  above,  studies  have  shown  that  there  is  a  similar  delay  in  responding  to  unexpected 
runway  intrusions  regardless  of  whether  the  symbology  is  conformal  or  non-conformal 
(Martin-Emerson & Wickens, 1997; Wickens & Long, 1995). Foyle et al. (1993) showed
that locating an altitude indicator directly on the flight path and in the centre of the field
of view resulted in a performance trade-off between maintaining a set altitude and
staying close to a flight path. When the altitude indicator was located above and to the
side of the flight path there was no performance trade-off: pilots were able to process the
HUD  altitude  symbology  and  follow  the  flight  path  successfully. 

In sum, the object-based hypothesis provides a framework that can be applied not
only to the problem of dividing attention between the two domains but also to the question
of how information within a domain can be processed and integrated more
effectively.

1.6  Summary 

Object-based  attention  has  been  implicated  in  research  aiming  to  explain 
problems  associated  with  information  processing  within  HUDs  and  HMDs.  These 
explanations  have  generally  been  predicated  on  the  assumption  that  the  HUD  symbology 
(the  near  domain)  forms  one  perceptual  group,  and  the  outside  scene  (the  far  domain) 
forms  another  group.  Thus,  the  problem  of  dividing  attention  between  the  near  and  far 
domains  has  been  explained  by  claiming  that  one  of  the  domains,  usually  the  symbology 
set,  captures  attention  at  the  expense  of  the  other,  a  phenomenon  referred  to  as  cognitive 
tunnelling.  Similarly,  the  presumed  benefits  of  conformal  symbology  have  been 
explained  by  claiming  that  conformal  symbology  allows  the  fusion  of  the  near  and  far 
domains  into  a  single  perceptual  group. 

Although the object-based attention hypothesis has received support from basic
research on attention, the hypothesis is still relatively undeveloped. Therefore, the
application of the hypothesis within the aviation literature has been problematic. One
problem  is  that  there  appears  to  be  a  tacit  assumption  in  the  literature  that  the  only 
‘objects’  that  attention  selects  are  the  near  and  far  domains;  the  study  of  object-based 
effects  within  a  domain  has  so  far  been  neglected.  For  instance,  it  is  possible  that  the 
benefits  accruing  from  the  use  of  conformal  symbology  may  be  due  to  the  symbology 
forming  more  coherent  visual  objects,  rather  than  to  a  presumed  fusion  of  the  near  and  far 
domains.  Similarly,  the  problem  of  visual  clutter  in  symbology  sets  might  be  best  studied 
by  examining  how  users  group  the  symbols  within  the  set.  Indeed,  the  assumption  that  the 
display  contains  two  perceptual  domains  has  added  little  to  our  understanding  of  visual 
clutter.  Moreover,  the  study  of  these  issues  is  complicated  by  a  number  of  confounds.  For 
instance,  the  McCann,  Foyle  and  Johnson  (1993)  study  shows  that  object-based  factors 
can  easily  be  confounded  with  spatial  cueing.  Similarly,  object-based  factors  can  be
confounded  with  the  informativeness  of  the  symbology,  as  is  the  case  for  the  use  of 
scene-linked  symbology  in  the  study  by  Foyle,  Stanford  and  McCann  (1991). 

To  a  large  degree,  these  problems  arise  because  the  definitions  of  ‘object’
and  ‘perceptual  group’  are  vague;  this  has  prevented  a  proper  account  of  the  distinction
between  spatial  and  object-based  factors,  and  of  their  interaction,  in  visual  attention. 
Precisely  what  these  concepts  mean  in  the  context  of  the  attentional  problems 
experienced  by  HUD  users  needs  to  be  more  fully  articulated  and  examined.  Perceptual 
groupings  or  objects  are  typically  defined  by  Gestalt  grouping  principles  of  motion,
colour,  proximity,  closure  and/or  figure-ground.  The  instantiation  of  Gestalt  grouping
principles  during  the  use  of  dynamic  displays  (e.g.,  HUDs)  has  been  relatively 
unexplored  and  several  important  questions  concerning  perceptual  grouping  need  to  be 
addressed.  For  example,  it  is  not  clear  whether  there  is  a  hierarchy  of  grouping  principles 
where  one  perceptual  grouping  principle  overrides  others.  If  a  hierarchy  exists,  is  it 
stable  or  fluid?  With  complex  displays,  it  is  not  known  whether  a  single  grouping 
principle  is  sufficient  for  forming  a  single  object  or  whether  more  than  one  grouping 
principle  must  be  put  into  place.  It  is  possible  that  the  addition  of  a  second  grouping 
factor  to  a  HUD  symbol  will  enhance  the  sense  of  the  symbol’s  objectness.  It  is  not 
clear,  however,  whether  the  extent  of  a  symbol’s  “objectness”  is  related  to  the  degree  of
object-based  attention  assigned  to  the  symbol.  This  is  important  to  determine  because  a 
HUD  symbol  that  is  perceived  as  a  strong  object  may  impact  on  cognitive  tunnelling. 



2.  METAPHORS  AND  MODELS  OF  SPATIAL  ATTENTION 


The  framework  proposed  here  focuses  primarily  on  object-based
attention  for  evaluating  the  design  of  HUDs.  However,  spatial  factors  such  as  the  effect
of  cueing  also  play  a  role  in  controlling  attention  in  HUDs.  To  establish  such  a
framework,  it  is  therefore  important  to  understand  not  only  the  principles  that  underlie  the
object  effect  but  also  how  object-based  and  spatial  factors  differ  and  how  they  may
interact  in  the  control  and  maintenance  of  attention.

2.1  Background 

Attention  is  a  fundamental  topic  in  cognitive  psychology  and  it  is  a  topic  that  has 
been  raised  extensively  in  human  factors  research.  However,  despite  the  fact  that 
numerous  models  of  attention  have  been  proposed  over  the  years,  there  is  still  much 
confusion  as  to  what  attention  is  and  what  it  does.  Clearly,  a  primary  role  for  attention  is 
that  of  selectively  enabling  processing  of  “privileged”  information.  But  this  is  by  no 
means  the  only  role.  Attention  serves  to  enhance  neural  information  processing, 
modulate  motor  responses  to  stimuli,  and  to  maintain  working  memory  and  the 
sequencing  of  cognitive  operations.  Attention  also  is  necessary  for  the  binding  together 
of  perceptual  features  into  a  single  phenomenal  object  or  percept  (Treisman,  1983; 
Treisman,  1988;  Treisman,  1998;  Treisman  &  Gelade,  1980).  Attention  also  allows  an
organism  to  select  a  relevant  mental  representation  of  the  environment  to  guide  further 
action  (Tipper  &  Weaver,  1998). 

The  contrast  between  the  Treisman  (1983)  and  the  Tipper  and  Weaver  (1998) 
views  is  interesting.  While  both  assume  that  attention  operates  on  mental  representations, 
Treisman  contends  that  attention  is  necessary  to  form  a  coherent  representation  of 
something,  while  Tipper  and  Weaver  take  some  form  of  completed  representation  for 
granted  and  assume  that  attention  selects  a  given  representation.  This  contrast  is  very 
informative,  because  it  illustrates  one  of  the  fundamental  tensions  in  the  attention 
literature:  does  attention  serve  the  role  of  integrating  and  processing  perceptual 
information  into  coherent  representations,  or  does  it  simply  select  “pre-formed” 
representations  for  further  processing  such  as  decision-making  and  action?  This  has  been
expressed  in  the  literature  in  many  ways,  the  most  prominent  being  the  debates  on  early 
vs.  late  selection  and  on  spatial  vs.  object-based  attention.  In  both  of  these  debates,  one
camp  assumes  that  perceptual  processing  (i.e.  the  creation  of  percepts  that  later  become 
representations  of  the  environment  through  object  recognition)  requires  attention  to 
happen;  this  is  the  case  for  the  early  selection  and  the  spatial  models  of  cognition.  The 
other  camp  assumes  that  some  degree  of  perceptual  processing  (shape  identification, 
object  recognition)  has  occurred  before  attention  is  directed  to  visual  stimuli,  and  that 
attention  simply  serves  the  purpose  of  selecting  one  of  these  representations;  the  late 
selection  and  the  object-based  attention  models  tend  to  fall  into  this  category.

2.2  Models  Of  Spatial  Attention 

Assumptions  about  cognition  and  attention  have  been  combined  to  produce 
several  different  metaphors  of  spatial  attention.  These  metaphors  get  cashed  out  as  a 
variety  of  different  models  by  various  researchers.  However,  it  is  useful  to  review  the 
metaphors  as  each  one  illustrates  a  basic  set  of  more-or-less  orthogonal  assumptions  that 
have  informed  the  more  detailed  models.  Moreover,  as  noted  below,  the  “fit”  between 
the  notion  of  object  and  spatial  attention  depends  on  the  particular  metaphor  of  spatial 
attention  that  one  adopts.  This  fit  will  impact  on  the  framework  that  is  used  to  research 
and  develop  HUD  symbology. 

2.2.1  Attention  as  a  filter 

Broadbent’s  (1958)  filter  model  is  one  of  the  earliest  attentional  metaphors. 
Broadbent  drew  inspiration  mainly  from  research  into  the  auditory  system  and  from 
assumptions  underlying  information  theory.  Indeed,  the  filter  model  was  an  attempt  to 
apply  information  theory,  as  developed  by  Shannon  (1938),  to  the  auditory  system. 
Broadbent’s  original  filter  model  had  the  following  features:

•  Attention  is  viewed  as  a  structure  of  the  cognitive  system  that  filters  out 
information.  That  is,  some  information  is  let  through  the  filter  for  further 
processing,  whereas  everything  else  is  simply  discarded.  What  is  let  through 
and  what  is  discarded  is  determined  solely  on  the  basis  of  the  physical 
characteristics  of  the  stimulus.

•  Filtering  operates  right  after  the  basic  physical  analysis  of  the  stimulus  but 
before  any  kind  of  conceptual  processing  takes  place  (i.e.  before  a  stimulus  is 
categorized  or  identified). 

•  The  purpose  of  attention  is  to  limit  how  much  information  the  cognitive  system
needs  to  process  at  a  given  time  -  effectively  protecting  it  from  overload  - 
because  cognition  is  seen  as  being  a  system  with  inherently  limited  resources 
and  abilities  for  coping  with  much  information  at  one  time. 

In  sum,  the  filter  metaphor  is  one  where  attention  facilitates  higher-order 
processing  by  protecting  the  system  from  “information  overload”  at  a  very  early  level. 
Although  the  metaphor  is  intuitively  appealing,  its  limitations  quickly  became
apparent.  First,  within  auditory  research,  experimental  evidence  showed  that  not  all
“unattended”  stimuli  were  entirely  discarded  by  the  system.  Second,  research  has  shown 
that  cueing  of  a  particular  region  of  the  display  enhanced  processing  of  information  at 
that  location.  In  these  cases,  since  the  stimuli  to  be  processed  were  often  the  only  ones  in 
the  display,  there  was  nothing  else  to  filter  out.  Yet,  responses  were  more  rapid  to  cued
than  to  uncued  stimuli.  These  results  suggested  that  a  simple  filter  model  was  inadequate
and  that  attention  plays  a  role  in  favouring  or  enhancing  processing  at  locations  in  the  visual
field.  Although  the  filter  model  has  been  modified  and  extended  somewhat  to  deal  with 
new  data,  the  experimental  evidence  eventually  suggested  another  metaphor:  the
attentional  spotlight. 

2.2.2  Attention  as  a  spotlight 

There  are  two  versions  of  the  spotlight  metaphor,  both  of  which  are  based  on  the
notion  that  attention  “highlights”  a  region  of  the  visual  field.  One  version  of  this 
metaphor  is  similar  to  the  filter  metaphor  in  that  what  falls  outside  of  the  attentional 
spotlight  is  presumably  not  processed  (Posner,  1980;  see  also  Fernandez-Duque  &
Johnson,  1999).  In  the  second  version,  the  spotlight  serves  to  concentrate  attentional 
resources  on  a  particular  region  of  space,  thereby  enhancing  processing  at  that  location,  but
without  completely  eliminating  processing  of  the  unattended  regions  (Downing  &  Pinker, 
1985;  Jonides,  1981). 

In  either  version,  however,  the  spotlight  metaphor  differs  from  the  filter  metaphor
in  several  important  respects.  First,  the  attentional  filter  is  viewed  as  a  structure  through 
which  information  must  flow.  In  contrast,  the  attentional  spotlight  is  not  a  structure  but 
rather  a  functional  enhancement  of  information  flow.  A  corollary  to  this  is  that  where  the 
filter  simply  blocks  unwanted  information,  the  spotlight  “selects”  and  enhances 
information  in  a  given  spatial  region  of  the  visual  field.  The  spotlight  does  not  set  things 
aside  but  “takes  what  it  needs”.  Second,  the  spotlight  has  a  spatial  dimension  that  the 
filter  lacks:  the  spotlight  selects  a  region  of  space,  whereas  the  filter  is  insensitive  to  the 
spatial  layout  of  information.  Third,  whereas  the  filter  is  simply  a  passive  mechanism 
that  sits  at  the  front  end  of  the  information  chain,  the  spotlight  can  be  controlled.  In 
particular,  the  spotlight  can  be  moved  to  different  parts  of  the  visual  field  and,  according 
to  some  specific  models,  the  size  and  possibly  the  shape  of  the  spotlight  can  be  changed. 
This  flexibility  raises  the  issue  of  what  controls  the  spotlight,  which  didn’t  emerge  with 
the  filter  metaphor. 

In  sum,  the  attentional  spotlight  metaphor  improves  on  the  filter  metaphor  in  that
it  allows  for  the  notions  of  enhancement  of  information  processing  (crucial  to  cueing 
data)  and  of  a  measure  of  control  over  what  information  gets  selectively  processed.  Most 
of  the  early  spotlight  models  included  the  assumption  that  the  spotlight  cannot  be  split
between  separate  regions  of  space.  Accordingly,  attending  across  several  regions  of 
space  required  a  serial  allocation  of  the  attentional  spotlight.  Later  versions  of  the 
spotlight  model  include  the  assumption  that  the  attentional  spotlight  can  be  split  among 
two  or  three  regions  of  space.

2.2.3  Attention  as  a  spotlight-in-the-brain 

Does  attention  select  features  of  the  visual  field  for  further  processing,  or  does  it 
select  already-processed  representations  of  the  environment  for  further  action?  The
spotlight-in-the-brain  metaphor  takes  the  latter  view  (LaBerge,  1995).  This  view  does  not 
necessarily  follow  from  the  assumption  that  attention  selects  objects  rather  than  features. 
Instead,  it  is  necessary  to  make  the  additional  assumption  that  the  “objects”  being 
selected  are  somewhat  sophisticated  mental  representations  of  objects,  rather  than  simply 
low-level  perceptual  groupings  that  map  rather  directly  onto  the  visual  stimulus. 

It  is  only  on  the  first  assumption  about  the  nature  of  the  objects  implicated  in
attention  that  the  spotlight-in-the-brain  idea  becomes  relevant.  If  the  objects  in  question 
are  simply  parts  of  the  visual  field  grouped  and  bundled  into  perceptual  wholes,  with  very 
little  conceptual  information  attached  to  them,  then  an  attentional  system  that  is  directed 
to  them  is  for  all  intents  and  purposes  directed  to  the  outside  world.  But  if  the  “objects” 
are  abstracted  representations  of  the  environment  and  thus  not  perfectly  correlated  with 
the  visual  environment,  and  if  these  carry  conceptual  information  added  by  the  system 
(e.g.,  a  person’s  expectations),  then  the  spotlight-in-the-brain  idea  becomes  more  clearly 
separated  from  the  spotlight  metaphor. 

Whereas  the  distinction  between  the  spotlight  and  the  spotlight-in-the-brain 
metaphors  is  relatively  clear,  it  is  rarely  equally  clear  which  of  the  two  a  particular 
researcher  is  basing  their  research  on.  One  reason  for  this  is  that  assumptions  as  to  the
nature  of  objects  are  rarely  made  explicit.  This  can  have  rather  dramatic  implications  for 
the  research  and  design  of  HUDs.  In  particular,  the  nature  of  the  “objects”  implicated  in 
attention  (relatively  high-level  representations  with  conceptual  content  vs.  relatively  low- 
level  perceptual  groupings)  has  a  bearing  on  what  kinds  of  information  will  affect  the 
allocation  of  attention  to  different  parts  of  a  visual  scene. 

2.2.4  Attention  as  vision 

The  filter  and  spotlight  metaphors  carry  the  implicit  assumption  that  perception 
and  attention  are  separate,  with  the  implication  that  attention  is  a  modality-independent 
mechanism  that  is  outside  of  perceptual  mechanisms  but  which  nevertheless  modulates 
perception.  This  is  in  stark  contrast  to  the  attention-as-vision  metaphor,  which  proposes 
that  visual  attention  is  a  property  of  the  visual  system  itself  (presumably  with  the  added 
implication  that  each  sensory  modality  has  its  own  attentional  system)  (van  der  Heijden,
1986).

The  basic  premise  of  the  attention-as-vision  metaphor  is  that  visual  attention  operates
in  a  manner  very  similar  to  low-level  vision,  and  that  both  in  fact  share  many 
mechanisms.  More  specifically,  the  focus  of  attention  is  thought  to  behave  like  the  fovea, 
and  the  movements  of  the  focus  of  attention  (i.e.  attentional  shifts)  are  thought  to  be  very 
similar  to  eye  movements  (saccades).  In  fact,  according  to  this  hypothesis,  attentional 
shifts  and  saccades  are  programmed  by  the  same  mechanisms  (Sheliga,  Riggio,  &
Rizzolatti,  1994).  A  few  assumptions  about  the  nature  and  role  of  attention  follow  from
this.  First,  because  the  human  fovea  cannot  be  split,  it  is  assumed  that  the  focus  of
attention  cannot  be  split.  Second,  attentional  shifts  and  ocular  saccades  are  thought  to  be 
highly  correlated.  Third,  under  the  attention-as-vision  metaphor,  attention  and  perception 
are  very  closely  linked,  operating  virtually  simultaneously.  On  this  view,  the  purpose  of 
attention  is  to  enhance  perception  and  to  guide  (target)  eye  movements. 

It’s  not  entirely  clear  how  this  attention-as-vision  metaphor  relates  to  spatial- 
based  versus  object-based  models  of  attention  because  the  metaphor  stakes  no  claim  as  to 
what  is  being  attended  to,  beyond  the  visual  scene  itself.  However,  there  are  some  links
that  can  be  made.  For  example,  the  unitary  fovea  is  reminiscent  of  the  claims  made  by 
some  spatial  attention  theorists  that  the  spotlight  cannot  be  split.  On  the  other  hand,  the 
close  link  between  perceptual  processes  and  attention  is  not  entirely  unlike  the  claims  that 
perceptual  processes  determine  what  parts  of  the  visual  field  are  attended  to  in  the  object- 
based  models,  with  the  caveat  that  most  object-based  theorists  assume  that  perception  is 
pre-attentive. 

The  attention-as-vision  metaphor  fares  relatively  well  in  the  face  of  selected 
experimental  data.  There  is  evidence  supporting  the  notion  that  saccades  and  attentional 
shifts  share  the  same  neural  machinery  (Sheliga,  Riggio,  &  Rizzolatti,  1994)  and  a  number
of  studies  suggest  that  attention  serves  as  a  targeting  system  for  ocular  movements  (Posner,
1992;  Posner  et  al.,  1988;  Rafal  &  Robertson,  1995).  On  the  other  hand,  research  by
Stelmach,  Campsall  &  Herdman  (1997)  has  shown  that  the  attentional  and  vision  systems 
can  be  dissociated  to  the  extent  that  eye  movements  can  be  executed  without  a 
concomitant  shift  in  attention.  Furthermore,  the  notion  that  the  focus  of  attention  is  a 
unitary  phenomenon  like  the  fovea  has  come  under  attack  from  experimental  data,  in 
particular  from  Driver  and  Baylis  (1989).  Despite  these  attacks,  the  idea  that  perceptual
processes  and  attention  are  very  closely  coupled  and  likely  share  many  mechanisms  is
generally  accepted,  and  several  papers  suggest  that  perceptual  grouping  requires  attention
(Mack  et  al.,  1992;  Rock  et  al.,  1992;  see  also  Ben-Av,  Sagi,  &  Braun,  1992).
Thus,  while  the  mechanisms  of  attention  might  not  map  perfectly  onto  the  ocular  mechanisms,
the  notion  that  perception  and  attention  can  be  cleanly  separated,  which  has  been 
traditionally  assumed  in  the  spatial  and  object-based  models,  is  coming  under  strain.  As
discussed  below,  this  will  likely  have  an  important  effect  on  the  object-based  models  of 
attention. 

3.  OBJECT-BASED  ATTENTION


In  the  object-based  attention  approach,  Gestalt  grouping  principles  rather  than 
spatial  location  are  assumed  to  be  the  dominant  factor  in  attentional  allocation  (Kramer  & 
Jacobson,  1991;  Lavie  &  Driver,  1996).  This  approach  is  based  on  research  showing  that
information  processing  is  facilitated  when  information  is  presented  within  a  single  object
relative  to  when  the  same  information  is  presented  in  different  objects.  Object-based 
models  explain  this  by  assuming  that  all  elements  belonging  to  a  single  object  are 
attended  in  parallel  whereas  elements  belonging  to  different  objects  are  attended  and 
processed  serially.  As  noted  below,  an  object-based  approach  can  be  accommodated  in 
concert  with  spotlight  models  of  attention. 

3.1  Relating  Object-Based  Attention  To  Spotlight  Models  Of  Attention 

The  spotlight  metaphor  has  produced  a  number  of  models  that  make  specific 
claims  about  the  nature,  medium  and  purpose  of  attention.  In  early  spotlight  models  it 
was  assumed  that  the  information  processed  within  the  spotlight  consisted  of  simple  features  and
spatial  properties  of  the  visual  field.  On  this  view,  attention  was  necessary  to  integrate 
these  features  into  objects.  That  is,  attention  was  seen  as  part  of  the  path  from  the 
processing  of  basic  visual  features  to  object  identification  and  recognition.  Recently, 
however,  it  has  been  hypothesized  that  the  “contents”  of  the  spotlight  are  perceptual  units
or  whole  objects.  Some  researchers  assume  that  the  purpose  of  the  spotlight  is  to  select 
specific  perceptual  units  for  further  processing  (i.e.  integration  into  a  larger  scene), 
whereas  others  (e.g.  Tipper  &  Weaver,  1998)  assume  that  attention  enables  a  person  to
select  specific  objects  within  a  scene  in  order  to  guide  action.  The  former  view  of 
attention  has  come  to  be  known  as  the  spatial  hypothesis  of  visual  attention,  whereas  the 
latter  is  referred  to  as  the  object-based  hypothesis  of  attention.

It  is  important  to  note,  however,  that  although  object  selection  can  be  incorporated
into  the  spotlight  metaphor,  the  metaphor  is  in  fact  more  compatible  with  the  spatial 
hypothesis.  This  is  because  the  spotlight  metaphor  is  usually  taken  to  imply  that  attention 
is  continuously  (even  if  not  uniformly)  distributed  within  the  spotlight,  whereas  the  data 
favouring  an  object-based  view  suggest  that  attention  is  relatively  discretely  distributed
between  objects,  and  is  not  allotted  to  areas  between  objects.  As  noted  by  Fernandez-Duque
and  Johnson  (1999),  the  metaphor  which  best  fits  the  object-based  view  is
“attention-as-spotlight-in-the-brain.”  To  this  end,  the  distinction  between  the  different 
nature  of  the  “objects”  implicated  in  attention  also  serves  to  illustrate  the  fundamental
difference  between  the  spotlight  and  the  spotlight-in-the-brain  metaphors.  Whereas  the 
former  is  usually  taken  to  imply  that  the  role  of  attention  is  to  integrate  primitive 
perceptual  units  (features  or  objects)  into  more  complex  representations,  the  latter  claims 
that  the  role  of  attention  is  not  to  enhance  perception  per  se  but  rather  to  operate  on  the 
“output”  of  perception  to  direct  action  (decision-making  and  movement).  Thus,  the 
spotlight  metaphor  draws  a  tight  link  between  attention  and  information  processing, 
whereas  the  spotlight-in-the-brain  metaphor  draws  a  tight  link  between  attention  and 
action. 

Importantly,  there  is  nothing  in  either  the  spotlight  or  the  spotlight-in-the-brain
metaphor  that  rules  out  the  idea  that  “objects”  of  some  sort  (e.g.  perceptual  wholes
that  are  more  than  mere  features)  are  contents  of  the  spotlight.  Indeed,  some 
investigators  of  object-based  attention  have  noted  that  an  attentional  spotlight  plays  an 
important  role  in  object-based  attention  (cf.  Lavie  &  Driver,  1996).  As  noted  by  Yantis
(1998,  p.  187),  an  attentional  spotlight  may  be  “responsible  for  selecting  important  or
task-relevant  objects  for  further  detailed  visual  processing  such  as  identification”. 
However,  other  researchers  have  explicitly  made  the  claim  that  attention  selects
representations  of  the  environment,  and  not  merely  parts  of  the  visual  field,  to  control 
action  (cf.  Tipper  &  Weaver,  1998).  Complicating  matters,  Yantis  states  that 
“perceptual  organization  mechanisms  create  object  representations  from  the  fragmented 
early  visual  image;  visual  attention  selects  one  or  more  of  these  for  delivery  to  high-level 
mechanisms”  (Yantis  1998,  p.  188).  Thus,  the  correlation  between  object-based  attention 
models  and  the  spotlight-in-the-brain  metaphor  isn’t  so  clear.  Some  proponents  of 
object-based  attention  have  models  more  in  line  with  the  simple  spotlight  metaphor, 
whereas  others  seem  more  inclined  towards  the  spotlight-in-the-brain  view. 

3.2  Direct  Experimental  Support  For  The  Object-Based  View

Early  support  for  object-based  attention  effects  came  from  experiments  reported 
by  Duncan  (1984)  and  Treisman  et  al.  (1983).  Duncan  presented  subjects  with  two 
overlapping  objects,  a  box  and  a  line  drawn  through  it  diagonally.  The  two  objects  could 
vary  on  two  dimensions:  the  box  could  be  small  or  large  and  have  a  gap  in  its  right  or  its
left  edge;  the  line  could  be  dotted  or  dashed  (texture)  and  tilted  to  the  left  or  to  the  right
(orientation).  The  subjects’  task  was  to  identify  two  attributes  on  either  the  line  or  the
box  or  one  attribute  on  each  object.  The  results  showed  that  subjects’  identification  was 
more  accurate  when  the  two  attributes  were  located  on  a  single  object  compared  to  when 
one  attribute  was  located  on  one  object  and  another  attribute  was  located  on  the  other 
object.  According  to  spatial  models,  attention  is  directed  to  a  location  in  space 
independently  of  any  structure  in  the  visual  field.  Given  that  the  whole  display  was  less 
than  1°  of  visual  angle,  it  would  be  hard  for  the  space-based  models  to  explain  the
same-object  effect  found  in  Duncan’s  experiment.  On  the  other  hand,  the  results  are  easily
accounted  for  by  the  object-based  hypothesis.  When  the  two  attributes  to  be  identified 
belong  to  different  objects,  subjects  must  shift  attention  from  one  object  to  the  other,
taking  additional  time  and  reducing  accuracy.

Treisman  et  al.  (1983)  obtained  a  similar  cost  in  performance  when  targets  to  be 
identified  belonged  to  different  objects.  Subjects  were  presented  with  a  rectangular  frame 
and  a  word,  which  were  configured  in  one  of  two  ways.  In  one  configuration,  the  frame 
and  the  word  were  presented  apart  (above  and  below  a  fixation  point)  representing  two 
distinct  objects.  In  the  other,  the  word  was  presented  within  the  frame,  forming  a  single 
object.  In  both  cases  the  distance  between  the  outline  of  the  frame  and  the  word  was  1°  of 
visual  angle.  The  subjects’  task  was  to  read  the  word  and  to  judge  the  location  of  a  gap  in 
the  frame.  The  gap  was  always  located  the  same  distance  from  the  word.  The  Treisman 
et  al.  results  were  clear:  performance  was  significantly  facilitated  when  the  word  was 
presented  within  the  frame  (presumably  forming  a  single  perceptual  object)  compared  to 
when  the  word  and  the  frame  were  separate. 

Both  of  these  experiments  strongly  suggest  that  attention  is  allocated  to  perceptual 
objects  in  the  visual  field.  Subsequent  research  has  further  supported  this  object-based 
claim.  For  example,  Goldsmith  (1998)  showed  that  visual  search  is  easier  when  features
are  linked  to  the  same  object  than  when  they  belong  to  different  objects.  Similarly,
Duncan  and  Nimmo-Smith  (1996)  found  that  it  is  harder  for  subjects  to  discriminate 
between  features  that  belong  to  different  objects  compared  to  a  situation  requiring 
discrimination  between  features  that  belong  to  a  single  object. 

Driver  and  Baylis  (1989)  reported  a  groundbreaking  experiment  on  object-based 
attention,  which  was  based  on  evidence  previously  interpreted  as  strong  support  for  a 
spatial  attention  hypothesis.  This  experiment  examined  response  competition  where 
interference  of  distractors  on  target  response  decreases  in  relation  to  increased  spatial 
distance  (Eriksen  &  Eriksen,  1974).  Driver  and  Baylis  used  the  Eriksen  and  Eriksen 
paradigm  (subjects  responded  to  a  central  letter  located  in  an  array  of  five  letters)  but 
grouped  the  target  and  distractors  together  using  common  motion.  When  the  outer  letters 
moved  with  the  target  they  interfered  more  with  target  identification  than  the  nearby 
letters  that  had  remained  stationary.  This  demonstrated  how  a  seemingly  spatial  effect 
breaks  down  when  the  target  and  the  distractors  are  grouped  together.  Using  one  of  the 
Gestalt  principles  (common  motion)  the  distant  distractors  were  grouped  with  the  target 
and  as  such  produced  more  interference  than  distractors  located  closer  to  the  target.  The 
spatial  models  cannot  account  for  these  results,  as  the  basic  claim  of  the  space-based 
hypothesis  is  that  attention  is  allocated  to  contiguous  regions  in  space,  and  that 
everything  within  such  a  region  gets  processed. 

Baylis  and  Driver  (1992)  extended  their  work  and  showed  that  other  grouping 
principles  besides  motion  can  override  the  spatial  (proximity)  effect.  Subjects  were
required  to  respond  to  a  central  target  (letter)  while  ignoring  distractors  located  either 
near  or  far  away  from  the  target.  In  accord  with  the  object-based  attention  hypothesis, 
when  the  distant  distractors  shared  a  colour  with  the  target  they  interfered  more  with 
responses  than  distractors  that  were  closer  to  the  target.  Also,  as  would  be  predicted 
based  on  Gestalt  grouping  principles,  good  continuation  between  the  target  and  the 
distractors  also  resulted  in  more  interference  regardless  of  distance. 

Kramer  and  Jacobson  (1991)  used  a  variation  of  the  response  competition 
paradigm  (Eriksen  &  Eriksen,  1974)  to  test  the  object  effect  in  a  focused  attention  task. 
Subjects  judged  whether  the  target  (a  line)  was  dashed  or  dotted  while  ignoring 
distractors.  Distracting  lines  (compatible  or  incompatible  with  target)  were  located  to  the 
left  or  right  of  the  target.  The  distractors  could  be  grouped  according  to  the  Gestalt
principles  with  the  target,  or  they  could  form  a  part  of  a  different  object.  The  distance
between  the  target  and  the  distractors  was  kept  constant.  According  to  the  space-based 
models  there  should  be  no  difference  between  the  different  conditions,  as  the  spatial 
separation  was  constant.  On  the  other  hand,  according  to  the  object-based  model  there 
should  be  less  interference  when  the  distractors  belong  to  a  separate  object.  As  predicted 
by  the  object-based  model,  the  interference  from  distractors  was  indeed  drastically 
reduced  or  eliminated  when  the  distractors  and  the  target  belonged  to  different  objects. 
Further,  when  the  incompatible  distractors  formed  part  of  the  same  object  as  the  target,
performance  (reaction  time  and  accuracy)  was  significantly  impaired  compared  to  when  the  compatible
distractors  belonged  to  the  same  object  as  the  target.

In  sum,  it  is  quite  apparent  that  the  response  competition  that  is  produced  by 
distractors  as  seen  in  the  experiments  mentioned  above  cannot  be  explained  by  referring 
to  the  spatial  hypothesis.  Distractors  that  are  grouped  with  the  target  by  colour  or
good  continuation  cause  significantly  more  interference  with  the  target  response  than
distractors  that  are  easily  separated  from  the  target.  Spatial  distance  seems  to  have  no
influence  here.  This  suggests  that  visual  attention  is  directed  to  perceptual  objects  in  the 
visual  field  that  are  segmented  according  to  the  Gestalt  grouping  principles. 

Experimental  evidence  supporting  object-based  attention  has  also  been  adduced 
from  IOR  tasks.  IOR  was  originally  interpreted  as  supporting  spatial  models  of  attention,
as  certain  locations  in  space  are  prevented  from  being  constantly  re-examined.  However, 
if  attention  is  object-based  then  the  inhibitory  mechanism  should  be  directed  towards 
structure  in  the  visual  field  rather  than  location.  To  this  end,  recent  evidence  suggests 
that  IOR  is  related  to  perceptual  objects  in  the  visual  field  rather  than  spatial  location
(Tipper,  Driver  &  Weaver,  1991).  Jordan  and  Tipper  (1998)  used  a  static  display  to 
examine  the  difference  between  cueing  location  vs.  cueing  visible  objects  in  the  display. 
The  display  consisted  of  black  “pacmen”  (discs  with  a  quadrant  missing)  and  lines.  In 
one  condition  objects  formed  by  illusory  contours  (Kanizsa  squares)  were  visible  whereas
in  another  condition  no  such  objects  were  visible.  The  IOR  effect  was  much  larger  when
an  object  was  cued  compared  to  when  only  a  location  was  cued.

3.3  Limitations  In  Experiments  On  The  Object-Based  Effect

Although  there  is  a  growing  body  of  experimental  evidence  suggesting  that  visual 
attention  interacts  with  perceptual  grouping  mechanisms,  there  are  problems  concerning 
the  interpretation  of  these  results.  The  main  problem  is  that  spatial  proximity  of  elements 
in  a  display  often  correlates  highly  with  those  elements  forming  a  perceptual  whole 
(structure).  Therefore  data  apparently  supporting  an  object-based  model  can  very  often 
be  interpreted  according  to  a  spatial  model,  and  vice  versa.  What  is  needed  is  an 
experimental  paradigm  that  allows  for  a  clearer  distinction  between  object-based  and 
spatial  effects. 

A  second,  related  problem  in  studies  attempting  to  differentiate  between
spatial  and  object-based  models  of  attention  is  that  often  the  whole  display  is  so  small
that  it  is  hard  to  rule  out  the  possibility  that  any  object  effect  might  be  happening  within  a
larger,  location-based  representation.  For  example,  in  Duncan’s  (1984)  experiment  the
whole  display  was  less  than  1°,  which  allows  for  only  a  very  small  and
potentially  insignificant  variation  in  spatial  distances.  It  could  be  argued  therefore  that
object-based  factors  are  only  important  within  an  attentional  spotlight.  Similarly,  with  the
paradigm  used  by  Treisman  et  al.  (1983),  the  spatial  area  relevant  when  the  word  was
presented  within  the  frame  was  much  smaller  than  the  area  relevant  when  the  word  and
the  frame  were  presented  separately.  Therefore  Treisman  et  al.’s  results  might  have
reflected  the  benefit  of  closeness  in  space.  Also,  the  onset  of  the  frame  may  have
exogenously  grabbed  attention,  making  it  difficult  to  read  the  word  in  the  condition  where
the  frame  and  word  were  presented  separately. 

Another  problem  in  the  literature  is  that  the  targets  are  often  so  different  that  the 
same-object  effect  can  be  the  result  of  some  artefact  (e.g.,  spatial  frequency  differences). 
In  Duncan’s  (1984)  experiments,  the  two  attributes  of  the  line  are  available  at  a  high 
spatial  frequency  whereas  the  two  attributes  of  the  box  are  available  at  low  spatial
frequency.  The  results  may  therefore  reflect  difficulties  in  processing  or  attending  to 
different  spatial  frequencies.  Furthermore,  the  targets  used  and  the  instructions  given 
often  indicate  objects  prior  to  the  actual  task.  To  wit,  Duncan’s  procedure  in  which 
subjects  were  required  to  identify  the  height  of  a  box  and  the  texture  of  a  line  may  bias 
subjects  toward  processing  the  stimuli  as  objects.

The  aforementioned  limitations  highlight  the  need  to  (a)  establish  a  better
experimental  paradigm  that  allows  for  a  clear  assessment  of  object-based  attention  effects 
and  (b)  acquire  converging  evidence  for  object-based  attention  from  other  research 
domains  and 

3.4  A  Better  Experimental  Paradigm:  Lavie  And  Driver  (1996) 

As  noted  above,  many  of  the  paradigms  that  have  been  used  to  examine  the 
effects  of  object-based  attention  have  generated  results  that  are  open  to  alternative 
explanations.  One  experimental  paradigm  that  appears  to  provide  a  relatively  pure  index 
of  object-based  attention  is  that  developed  by  Lavie  and  Driver  (1996).  Lavie  and 
Driver’s  paradigm  allowed  for  testing  the  difference  between  space-based  and  object-based
models  while  avoiding  many  of  the  potential  problems  discussed  above.  Accordingly, 
the  Lavie  and  Driver  paradigm  was  used  as  a  foundation  for  the  present  experiments. 

The  Lavie  and  Driver  (1996)  paradigm  was  designed  based  on  four  main  criteria. 
The  first  criterion  was  to  manipulate  the  division  between  the  objects  over  a  wide  spatial 
area.  This  allows  the  examination  of  whether  the  object  effect  is  only  valid  within  a
spatial  “spotlight”  and  the  measurement  of  both  space-based  and  object-based  effects  at  the  same  time.
The  second  criterion  was  to  make  the  target  stimuli  the  same  both  within  and  between
objects.  This  would  rule  out  any  possible  artefacts  such  as  spatial  frequency  differences.
The  third  criterion  was  to  create  a  task  that  was  neither  object  dependent  nor  object
independent.  That  is,  the  targets  to  be  identified  should  form  a  natural  part  of  the  object 
without  their  identification  in  any  way  implying  objectness.  The  final  criterion  was  to 
keep  eccentricity  and  acuity  equal  for  all  targets.  Also,  Lavie  and  Driver  made  sure  that 
the  instructions  to  the  participants  and  the  task  requirements  would  in  no  way  imply 
objects  or  location  prior  to  the  actual  task  performance. 

The  paradigm  used  by  Lavie  and  Driver  (1996)  consisted  of  two  dashed  lines 
presented  together  briefly.  One  line  was  presented  in  green  and  the  other  one  in  red.
Each  line  formed  an  object  according  to  the  principles  of  good  continuation  and  grouping
by  colour.  The  whole  display  subtended  13°  of  visual  angle.  The  subject’s  task  was  to 
respond  as  quickly  as  possible  to  targets  that  appeared  on  the  lines.  The  two  targets  could 
either  be  a  small  dot  replacing  one  of  the  dashes  within  a  dashed  line,  or  a  gap  (i.e.,  one  of
the  dashes  removed).  Subjects  responded  by  identifying  (button  push)  whether  the  targets
were  the  same  (two  dots/two  gaps)  or  different  (a  gap  and  a  dot).  There  were  three
conditions:  the  near  condition  (the  targets  appeared  close  together  in  space  but  on
different  lines),  the  far  condition  (the  targets  appeared  far  apart  in  space  and  on  different
lines)  and  the  object  condition  (the  targets  appeared  far  apart  but  on  the  same  line).
According  to  the  spatial  hypothesis  the  fastest  response  should  occur  under  the  near 
condition  where  the  two  targets  are  close  together  in  space  but  located  on  different  lines. 
There  should  also  be  no  significant  difference  between  the  far  and  the  object  condition  as 
the  spatial  separation  in  both  cases  is  similar.  On  the  other  hand,  according  to  the  object- 
based  hypothesis  the  fastest  reaction  time  should  occur  within  the  object  condition  (far 
apart  but  on  same  line). 
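
The competing predictions can be made concrete with a small sketch. The Python below is illustrative only and is not part of Lavie and Driver's (1996) materials: the names `Condition`, `spatial_rank`, `object_rank` and `predicted_fastest` are hypothetical, and each hypothesis is reduced to an ordinal ranking (lower rank means a faster predicted response) rather than a quantitative model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    name: str
    targets_far_apart: bool  # True if the two targets are widely separated
    same_object: bool        # True if both targets lie on the same line


# The three conditions as described in the text.
CONDITIONS = [
    Condition("near", targets_far_apart=False, same_object=False),
    Condition("far", targets_far_apart=True, same_object=False),
    Condition("object", targets_far_apart=True, same_object=True),
]


def spatial_rank(c: Condition) -> int:
    # Spatial hypothesis (simplified): only separation matters,
    # so a smaller separation gets the better (lower) rank.
    return 1 if c.targets_far_apart else 0


def object_rank(c: Condition) -> int:
    # Object-based hypothesis (simplified): targets on the same
    # object get the better (lower) rank.
    return 0 if c.same_object else 1


def predicted_fastest(rank) -> list:
    # Names of the condition(s) with the best rank under a hypothesis.
    best = min(rank(c) for c in CONDITIONS)
    return sorted(c.name for c in CONDITIONS if rank(c) == best)
```

Under these simplifying assumptions, `predicted_fastest(spatial_rank)` picks out the near condition and `predicted_fastest(object_rank)` picks out the object condition, while the spatial ranking ties the far and object conditions, as described above.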

The  results  showed  a  clear  object  advantage  thereby  supporting  the  object 
hypothesis  over  the  spatial  hypothesis  and  establishing  the  paradigm  as  a  valid  tool  for 
testing  the  object  effect.  Responses  were  significantly  faster  and  more  accurate  under  the 
object  condition  compared  to  the  near  and  far  conditions.  Furthermore,  there  was  no
significant  difference  in  responses  to  the  targets  in  the  far  and  near  conditions.  The  Lavie
and  Driver  (1996)  object  effect  seemed  robust.  Even  when  the  proportion  of  the  near
trials  (targets  close  together  but  on  separate  lines)  was  doubled  there  was  still  a  significant
advantage  for  the  object  condition  compared  to  the  other  two  conditions.  Similarly,
equating  the  luminance  of  the  lines  and  presenting  the  targets  as  either  short  or  long  white
dashes  had  no  impact  on  the  object  effect. 

3.5  Converging  Evidence:  Neuropsychological  Research  On  Object-Based  Attention 

Studies  on  patients  suffering  from  neurological  trauma  (accidents,  strokes,  tumors 
etc.)  provide  information  that  further  clarifies  the  interaction  between  perception  and 
attention.  In  particular,  these  studies  impact  on  the  issue  of  the  nature  of  the  neural 
representations  associated  with  attention.  As  summarized  below,  neuropsychological 
evidence  strongly  supports  the  notion  that  attention  is  object-based  (Egly,  Driver  &  Rafal, 
1994;  Humphreys  &  Riddoch,  1993;  Robertson  &  Rafal,  2000). 

The  most  common  neurological  condition  implicating  attention  is  unilateral 
neglect  (Palmer,  1999;  Humphreys  &  Riddoch,  1993).  Unilateral  neglect  is  due  to  a
brain  lesion  in  the  parietal  lobe,  most  commonly  in  the  right  hemisphere  although  lesions
in  the  left  hemisphere  have  also  been  reported  to  cause  the  disorder.  Patients  with 
parietal  damage  fail  to  notice  stimuli  that  are  located  in  the  side  of  the  visual  field 
contralateral  to  the  brain  injury.  For  example,  if  asked  to  copy/draw  a  picture  of  an 
object  (e.g.,  clock,  ball,  house),  a  patient  with  right  parietal  damage  will  draw  only  the 
right  side  of  that  object  (Palmer,  1999). 

Unilateral  neglect  has  commonly  been  viewed  as  supporting  space-based  views  of 
attention.  However,  recent  research  shows  that  unilateral  neglect  is  due  to  a  difficulty  in
disengaging  attention  from  a  currently  attended  object  within  the  unimpaired  side  to  a
new  object  within  the  impaired  side  (Humphreys  &  Riddoch,  1993;  Kanwisher  &  Driver,
1992;  Posner  et  al.,  1987).  For  example,  patients  with  unilateral  neglect  can  often  detect
objects  on  either  side  of  the  visual  field,  but  when  presented  with  objects  on  both  sides
simultaneously  they  report  only  the  object  on  the  side  ipsilateral  to  the  damage
(Kanwisher  &  Driver,  1992).  Further,  when  patients  with  unilateral  neglect  are  asked  to
cross  out  lines  drawn  on  paper,  they  cross  out  only  the  lines  on  the  ipsilateral  (right)  side.
However,  when  asked  to  erase  the  lines,  the  patients  will  begin  by  erasing  the  lines  on  the
ipsilateral  side  but  will  eventually  erase  the  ones  on  the  contralateral  side  as  well.  Since 
crossing  out  lines  does  not  remove  the  lines  from  the  visual  field,  it  has  been  suggested 
that  the  patients  are  simply  failing  to  disengage  attention  from  the  lines  they  just  crossed 
out.  On  the  other  hand,  when  the  lines  are  erased,  it  becomes  easier  to  attend  to  the  lines 
on  the  left  as  there  are  no  longer  any  lines  present  on  the  right  side  to  capture  attention 
(Palmer,  1999). 

Robertson  and  Rafal  (2000)  showed  that  if  a  patient  with  unilateral  neglect  is 
presented  with  a  target  and  distractors,  the  location  of  the  distractors  (located  on  their 
unimpaired  side  versus  both  sides)  affects  target  detection.  If  the  distractors  are  only 
located  on  the  unimpaired  (ipsilateral)  side,  target  detection  decreases  compared  to  the 
case  where  the  distractors  are  located  on  both  sides.  It  has  been  suggested  that  placing 
distractors  only  on  the  ipsilateral  side  to  the  target  moves  the  center  of  the  patient’s 
representation  of  the  display  (the  target  and  the  distractors)  further  to  the  contralateral 
side  in  an  object-based  frame  of  reference.  Hence  the  patient  shows  reduced  accuracy  and 
speed  of  performance.  This  finding  is  interesting  as  it  suggests  that  performance  is  not
related  to  spatial  location  in  the  visual  field,  but  instead  to  where  the  object  is  located
within  an  array  of  objects.  That  is,  attention  is  based  on  grouped  structures  in  the  visual 
field  rather  than  on  spatial  location. 

Some  of  the  strongest  evidence  that  representations  of  objects  influence  attention 
comes  from  patients  with  Balint’s  syndrome  (Humphreys  &  Riddoch,  1993;  Palmer,
1999).  Balint’s  patients  cannot  see  anything  except  a  single  fixated  object  and  it  is 
extremely  difficult  for  them  to  switch  fixation  from  one  object  to  another.  Even  in  a 
complex  field  of  many  objects  they  fail  to  perceive  more  than  a  single  object.  This  is 
even  true  when  two  objects  occupy  the  same  spatial  location.  For  example,  the  patient 
can  be  looking  at  a  person’s  face  but  fails  to  see  that  the  person  is  wearing  glasses.  Also, 
when  these  patients  are  shown  two  objects  overlapping  in  space  (triangle  and  circle),  they
can  only  report  one  object.  This  is  hard  to  account  for  with  a  spatial  attention  approach 
but  fits  well  with  the  object-based  attention  hypothesis. 

In  an  experiment  by  Humphreys  and  Riddoch  (1993)  two  patients  suffering  from 
Balint’s  syndrome  were  presented  with  circles  of  different  colours.  The  circles  were 
either  green  or  red,  and  the  patients’  task  was  to  detect  whether  the  circles  had  the  same
colour  or  whether  they  had  different  colours.  There  were  three  conditions,  random, 
single-object  and  mixed-object.  In  the  random  condition  black  lines  were  placed 
randomly  in  between  the  coloured  circles.  In  the  single-object  condition,  the  same  black 
lines  connected  circles  of  same  colour  together,  thereby  forming  a  single  object.  In  the 
mixed-object  condition,  the  lines  connected  circles  of  different  colour  together  similarly 
forming  a  single  object.  All  three  conditions  had  circles  of  green  and  red  colours  located 
close  together,  the  difference  being  the  colours  of  the  circles  that  were  connected.  The
spatial  separation  between  the  circles  varied.  The  results  showed  that  the  patients  had 
better  performance  in  the  mixed  condition  compared  to  single-object  or  random 
conditions.  If  attentional  capture  is  space-based  then  there  should  not  have  been  any 
difference  between  the  random  and  mixed-object  conditions.  In  the  random  condition 
circles  of  different  colours  were  located  close  together,  although  not  connected  as  in  the
mixed-object  condition.  On  the  other  hand,  if  attentional  capture  is  object-based,  as  has 
been  suggested,  the  mixed-object  condition  should  be  helpful,  since  fixating  on  a  single 
object  (circles  of  different  colours  connected)  in  this  case  gives  the  patients  all  the
relevant  information  for  identifying  whether  the  circles  are  the  same  or  not.  In  the
random  condition  the  patients  would  fixate  on  a  single  circle,  making  it  difficult  to  attend 
to  other  circles  in  the  display.  Similarly,  in  the  single-object  condition,  attending  to  a 
single  object  (where  only  circles  of  the  same  colour  are  connected  together)  doesn’t  help 
the  patients  compare  the  colour  of  the  circles  in  the  display.  A  further  problem  for  spatial 
models  is  that  variation  in  spatial  distance  did  not  change  the  benefit  found  for  the  mixed- 
object  condition. 

Further  evidence  that  attention  is  object-based  is  found  in  electrophysiological 
studies  measuring  event-related  potentials  (ERPs).  Valdes-Sosa,  Bobes,  Rodriguez  and 
Pinilla  (1998)  measured  ERPs  with  a  paradigm  designed  to  test  object-based  effects. 
Subjects  were  presented  with  two  sets  of  dots  that  differed  in  colour.  The  subjects  were
instructed  to  attend  to  either  set  of  dots  (green  or  red).  Their  task  was  then  to  detect  a 
brief  linear  displacement  within  one  set  of  dots  and  detect  the  dominant  direction  of 
movement  (only  a  subset  of  dots  moved  in  the  same  direction).  The  sudden  onset  of  the 
linear  displacement  served  to  elicit  motion  onset  ERPs.  In  one  condition,  the  two  sets  of 
dots  rotated  in  opposite  directions  creating  the  perception  of  two  transparent  surfaces 
sliding  across  each  other.  In  another  condition  the  two  sets  of  dots  remained  stationary,
creating  the  percept  of  a  single  object.  In  a  third  condition,  the  two  sets  of  dots  rotated  in 
the  same  direction  similarly  creating  the  percept  of  a  single  object.  The  results  showed  a 
clear  difference  between  the  dual-object  and  single-object  conditions.  In  the  dual-object 
condition  there  was  a  large  difference  in  ERP  depending  on  whether  the  linear 
displacement  occurred  in  the  attended  or  unattended  set  of  dots.  In  particular,  whereas  a 
clear  ERP  was  found  when  the  linear  displacement  occurred  in  the  attended  set,  a  strong 
suppression  was  found  for  all  components  related  to  ERP  when  the  linear  displacement 
occurred  in  the  unattended  object.  No  difference  between  the  attended  and  unattended
sets  was  found  in  the  single  object  condition. 

Studies  on  the  neurological  basis  of  attention  suggest  the  existence  of  two 
components  to  the  object-based  control  of  attention.  On  the  one  hand,  attention  is 
directed  to  whole  perceptual  groups  in  space.  On  the  other  hand,  evidence  strongly 
suggests  that  there  exists  a  coordinate  system  that  allocates  attention  within  a  more  spatially
invariant  object-centered  frame  (Behrmann  &  Tipper,  1994;  Reuter-Lorenz,  Drain  &
Hardy-Morais,  1996).  An  object-centered  frame  does  not  change  according  to  position  in
the  visual  field  or  viewer  perspectives.  Research  on  patients  with  unilateral  neglect
supports  the  idea  of  an  object-centered  frame:  patients  with  unilateral  neglect  will  often
neglect  one  side  of  the  object  (contralateral  to  the  damage)  regardless  of  the  position  of 
the  object  in  the  patients’  visual  field.  For  example,  if  a  person  stands  in  front  of  a 
neglect  patient,  the  patient  will  not  see  the  person’s  right  arm  (the  arm  on  the  patient’s 
left  side).  However,  if  the  person  rotates  90°  to  the  left  or  right,  the  patient  still  fails  to
see  the  right  arm,  regardless  of  the  fact  that  when  rotating  to  the  left  the  right  arm  will  fall
within  the  patient’s  unimpaired  visual  area  (Robertson  &  Rafal,  2000).  Furthermore,  if  an
object  is  placed  entirely  within  the  unimpaired  field  of  vision,  patients  with  unilateral 
neglect  still  do  not  perceive  an  aspect  of  the  object  on  the  side  contralateral  to  the  lesion. 
This  suggests  a  frame  of  attentional  control  that  is  determined  by  a  single  object  (Palmer, 
1999;  Reuter-Lorenz  et  al.,  1996).

Although  most  evidence  suggesting  an  object-centered  frame  of  attentional
control  comes  from  patients  suffering  from  brain  lesions,  there  is  evidence  suggesting
that  it  also  influences  the  normal  brain.  As  noted  by  Reuter-Lorenz  et  al.  (1996),  subjects
with  normal  brain  function  show  a  contralateral  attentional  bias  for  each  hemisphere.
That  is,  detecting  a  target  located  on  the  left  side  of  an  object  in  the  left  visual  field  is
easier  than  if  the  target  is  on  the  right  side.  The  same  benefit  is  found  for  right  side  with 
objects  in  the  right  visual  field.  As  with  the  data  from  unilateral  neglect  patients,  this 
hemispheric  data  shows  that  attention  is  allocated  on  the  basis  of  object-based 
representations  of  the  environment  in  non-patient  populations. 


3.6  Object-Based  Attention  To  Moving  Stimuli 

Most  investigations  of  object-based  attention  have  involved  the  use  of  static  or 
near-static  displays.  However,  insofar  as  HUDs  are  used  in  dynamic  visual 
environments,  research  examining  how  attention  is  assigned  to  moving  objects  is  clearly 
relevant. 

Pylyshyn  and  Storm  (1988)  tested  the  assumption,  arising  from  a  spatial  model  of 
attention,  that  people  are  able  to  track  many  elements  by  moving  their  attentional
spotlight  from  element  to  element  in  rapid  succession.  They  first  tested  subjects’  ability
to  track  independently  moving  and  similar  elements  within  a  display  of  many  moving
elements.  This  showed  that  people  are  able  to  successfully  track  about  four  or  five
elements  for  at  least  10  seconds.  A  computer  simulation  of  this  task  was  developed  using
the  actual  trajectories  that  were  shown  to  subjects  combined  with  a  model  of  a  spatial 
spotlight.  This  simulation  showed  that  an  attentional  spotlight  moving  from  element  to 
element  would  not  be  able  to  keep  track  of  the  elements.  Thus,  Pylyshyn  and  Storm 
concluded  that  a  spatial  attention  hypothesis  cannot  account  for  people’s  ability  to
perform  multiple  object  tracking. 
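
The logic of this argument can be illustrated with a back-of-the-envelope calculation. The sketch below is not Pylyshyn and Storm's simulation; it is a deliberately simplified toy, and the function name `required_scan_speed` and every parameter value are hypothetical. It assumes that a single spotlight must revisit each tracked element before that element has drifted more than some tolerated distance, and asks how fast the spotlight would have to travel to do so.

```python
def required_scan_speed(n_targets: int,
                        mean_separation_deg: float,
                        element_speed_deg_s: float,
                        max_drift_deg: float) -> float:
    """Scan speed (deg/s) a single serial spotlight would need.

    Toy assumptions (not taken from the source):
    - the spotlight visits the n targets one after another in a cycle;
    - consecutive targets are mean_separation_deg apart on average;
    - an element is lost if it drifts more than max_drift_deg
      between two successive visits;
    - all arguments are positive.
    """
    # Longest time allowed between two visits to the same element.
    revisit_interval_s = max_drift_deg / element_speed_deg_s
    # Distance the spotlight must cover in one full cycle.
    path_length_deg = n_targets * mean_separation_deg
    return path_length_deg / revisit_interval_s


# Hypothetical example: 4 targets, 5 deg apart, moving at 8 deg/s,
# with 1 deg of tolerated drift between visits.
speed = required_scan_speed(4, 5.0, 8.0, 1.0)  # 160.0 deg/s
```

In this toy model the required scan speed grows linearly with the number of tracked elements and with element speed, which conveys why a serial spotlight account becomes strained as tracking load increases.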

There  are  two  object-based  accounts  of  how  people  might  perform  multiple 
tracking  of  objects.  One  account  is  that  the  multiple  elements  are  formed  into  a  nonrigid 
polygon,  with  each  element  being  one  of  the  vertices  of  the  polygon.  Consistent  with  this  view,
Yantis  (1992)  found  that  tracking  performance  is  affected  by  factors  that  facilitate  the
initial  formation  and  maintenance  of  a  perceptual  group  of  elements  to  be  tracked.  Thus, 
Yantis  argues  that  specific  elements  within  a  display  of  similar  moving  elements  are 
tracked  by  grouping  the  elements  into  a  single  “superobject”  (essentially  a  nonrigid 
polygon).  The  less  polygon-like  the  nonrigid  polygon  is,  the  harder  the  task. 

Pylyshyn  and  his  colleagues  have  developed  a  different  account,  which  assumes 
that  each  element  being  tracked  is  associated  with  a  visual  index,  or  a  Finger  of
INSTantiation  (FINST;  Pylyshyn,  1989).  On  this  view,  the  early  visual  system  attaches  indexes
to  the  individual  elements  to  be  tracked.  These  indexes  provide  a  way  for  the  visual 
system  to  pick  out  specific  elements  of  the  visual  field  by  referring  to  the  elements 
themselves,  and  not  to  any  properties  of  the  objects  (Pylyshyn,  1998).  These  indexes 
then  allow  the  rest  of  the  visual  system  to  attend  to  those  specific  elements,  to  track  them, 
to  identify  them,  and  so  on  (Scholl  &  Pylyshyn,  1999;  Sears  &  Pylyshyn,  in  press).  Thus, 
visual  indexes  are  a  kind  of  representation  or  data  structure  within  the  visual  system 
which  function  in  a  manner  analogous  to  linguistic  indexes  and  demonstratives  (e.g.,
words  such  as  “that”  or  “there”).

In  sum,  the  ability  to  track  moving  objects  appears  to  require  that  attention  be 
object-based  at  least  to  some  degree.  Spatial  models  of  attention  are  inadequate  for 
explaining  the  experimental  evidence  obtained  from  multiple  tracking  tasks,  as  these
models  would  require  the  spatial  spotlight  to  visit  each  moving  element  in  rapid
succession.  Successive  tracking  of  this  sort  would  mean  that  the  visual  system  is  able  to
predict  the  positions  of  the  moving  elements  and  to  move  the  attentional  spotlight  at  a
speed  that  is  beyond  the  capacities  of  the  human  visual  system.


3.7  Summary 

A  growing  body  of  experimental  and  neuropsychological  research  supports  the 
conclusion  that  attention  is  referenced  to  perceptual  groups  or  objects  within  the  visual 
field.  This  is  known  as  the  object-based  attention  hypothesis.  The  object-based  attention 
hypothesis  provides  an  account  of  attentional  effects  in  both  static  displays  and  in 
situations  where  objects  must  be  tracked. 

The  object-based  attention  hypothesis  has  implications  for  research  and 
development  of  HUDs  and  for  the  integration  of  HUDs  into  HMDs.  For  example,  based 
on  Gestalt  principles,  perceptual  groupings  of  HUD  symbology  will  be  formed  based  on 
common  motion,  colour,  proximity,  closure  and/or  figure-ground  separation.  Object- 
based  attention  may  underlie  difficulties  associated  with  pilots’  need  to  process  near 
(HUD)  and  far  (external  scene)  domain  information:  near  and  far  domains  differ  along 
one  or  more  of  the  Gestalt  grouping  principles.  An  object-based  attention  framework, 
and  a  corresponding  paradigm  for  assessing  object-based  attention  effects,  would  be 
useful  for  gaining  a  metric  on  near  versus  far  domain  attentional  capture  and  cognitive 
tunnelling.  Such  a  paradigm  would  also  provide  a  method  for  measuring  the  impact  of 
conformal  vs.  nonconformal  symbology.

SECTION  TWO:  EXPERIMENTS  ON  OBJECT-BASED  ATTENTION


There  is  a  growing  body  of  experimental  and  neuropsychological  research 
supporting  an  object-based  attention  hypothesis  where  attention  is  assumed  to  be 
allocated  to  objects  in  the  visual  field  (Duncan,  1984;  Kanwisher  &  Driver,  1992;  Kramer
&  Jacobson,  1991;  Humphreys  &  Riddoch,  1993;  Valdes-Sosa  et  al.,  1998).  Research
within  the  field  of  aviation  psychology  has  adopted  the  object-based  attention  hypothesis 
in  order  to  explain  cognitive  tunnelling  and  other  perceptual/attentional  problems  in 
HUDs.  Indeed,  as  noted  in  Section  One  of  this  report,  a  review  of  the  literature  shows 
that  the  object-based  attention  hypothesis  provides  a  promising  framework  for  the
systematic  study  of  the  attentional  problems  associated  with  HUDs.  However,  the  use  of 
the  object-based  attention  framework  in  the  aviation  literature  has  been  based  on  loosely 
defined  concepts  and  a  lack  of  understanding  of  how  the  object-based  hypothesis  may 
apply  to  the  dynamic  environment  of  HUDs.  For  example,  one  of  the  misconceptions  is 
that  the  only  relevant  “objects”  of  concern  for  attentional  control  are  the  near  (HUD)  and 
far  (external  scene)  domains.  As  noted  in  Section  One,  forming  coherent  objects  within  a
single  domain  can  have  important  perceptual/attentional  consequences.  Furthermore,  the 
vague  object-based  attention  framework  that  exists  in  the  aviation  literature  makes  it 
difficult  for  researchers  to  avoid  confounding  object-based  effects  with  other  factors  that 
may  influence  performance. 

The  goal  of  the  present  research  program  was  to  further  develop  the  object-based 
attention  framework  for  HUD/HMD  applications.  A  series  of  five  experiments  are 
reported  in  this  section.  Experiments  1-4  were  conducted  to  (a)  establish  an  object-
based  attention  paradigm  and  obtain  a  sense  of  effect  sizes  and  reliability,  impact  of
instructional  sets,  and  laboratory  constraints,  (b)  obtain  initial  evidence  regarding  what
attributes  are  necessary  in  order  to  make  an  object,  and  (c)  examine  the  relationship 
between  object-based  attention  and  spatial  cueing.  The  Lavie  and  Driver  (1996) 
experimental  paradigm  was  used  as  a  foundation  for  the  present  experiments.  In 
Experiment  5,  the  object-based  attention  paradigm  was  extended  to  a  dynamic  display.
Experiment  5  represents  a  significant  step  toward  achieving  the  goal  of  examining  object-
based  attention  effects  in  HUDs  and  HMDs.


EXPERIMENT  1 

Establishing  An  Object-Based  Attention  Paradigm 

The  goal  of  Experiment  1  was  to  establish  an  initial  experimental  paradigm  to 
examine  effects  of  object-based  attention.  To  do  this,  the  paradigm  developed  by  Lavie 
and  Driver  (1996)  was  adopted.  This  paradigm  allows  for  the  simultaneous  manipulation 
and/or  control  of  spatial  separation  while  examining  object  factors.  In  addition,  the  Lavie 
and  Driver  paradigm  addresses  earlier  criticisms  of  studies  comparing  object-based  and  space-
based  effects  on  attention.  The  display  extends  over  a  large  spatial  area  of  roughly  14°.
The  targets  used  are  the  same  for  all  conditions  and  they  are  equidistant  from  the  central
fixation  point.  Instructions  and  task  requirements  give  no  prior  indication  of  objectness.

The  display  consisted  of  two  dashed  lines.  Subjects’  task  was  to  identify  whether
two  targets  (each  a  dot  or  a  gap)  appearing  anywhere  in  the  display  were  the  same  or
different.  The  targets  appeared  either  far  apart  on  the  same  line  (object  condition),  far
apart  on  different  lines  (far  condition),  or  close  together  on  different
lines  (near  condition).  According  to  the  spatial  hypothesis,  latencies  should  be  lower  in
the  near  than  far  condition  because  the  spatial  separation  of  the  targets  is  less  in  the
former  condition.  The  spatial  hypothesis  also  predicts  that  latencies  should  be  generally
equivalent  in  the  object  and  far  conditions.  If  any  differences  are  found,
latencies  should  be  shorter  in  the  far  than  the  object  condition  because  the  spatial
separation  of  targets  is  slightly  larger  in  the  object  condition.

The  object-based  attention  hypothesis  states  that  it  is  faster  to  search  for  targets  on 
a  single  object  (one  line)  than  targets  presented  on  two  different  objects  (two  lines) 
(Duncan, 1984; Kramer & Jacobson, 1991). Accordingly, the object-based prediction is
that  responses  should  be  faster  in  the  object  than  the  far  condition.  Faster  responses  to 
targets  in  the  object  than  the  near  condition  would  represent  strong  evidence  for  an  object 
effect.  This  pattern  would  show  an  effect  of  object-based  attention  that  transcends 
differences  in  spatial  separation:  targets  are  much  closer  in  the  near  than  the  object 
condition.


Method 

Subjects.  A  total  of  7  university  students  participated  in  the  study.  Two  of  the  subjects  were 
aware  of  the  hypothesis  in  the  study. 

Apparatus. The stimuli were presented using an IBM-compatible 486 computer and a 14-inch
VGA  colour  monitor.  Responses  were  recorded  using  the  numeric  keypad  on  the  computer’s  keyboard. 

The  experiments  were  created  and  run  using  Micro  Experimental  Laboratory  (MEL)  version  2.0 
(Schneider,  1995).  Response  times  were  accurate  to  1  ms. 

Stimuli.  The  stimuli  were  one  red  dashed  line  and  one  green  dashed  line  presented  against  a  grey 
background.  The  lines  intersected  at  their  midpoint  at  the  centre  of  the  display.  One  line  was  horizontal 
and  the  other  line  was  tilted  at  an  angle  of  18°.  Each  dashed  line  was  equally  likely  to  be  red  or  green.  The 
horizontal  line  was  18.9  cm  long  and  the  tilted  line  was  20.2  cm.  At  a  fixed  viewing  distance  of  80  cm  the 
whole  display  extended  slightly  over  14°  of  visual  angle.  Each  line  contained  15  elements  equally  spaced. 
The  dashes  were  0.9  cm  in  length  and  two  pixels  in  height. 
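The visual angles quoted for the stimuli can be checked from the physical sizes and the viewing distance. The following is a minimal sketch, assuming the standard formula for an extent centred on the line of sight; the function name visual_angle_deg is illustrative and not part of the original software.

```python
import math


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (in degrees) subtended by an extent centred on the line of sight."""
    return math.degrees(2.0 * math.atan((size_cm / 2.0) / distance_cm))


VIEWING_DISTANCE_CM = 80.0  # fixed viewing distance reported in the Method

# Physical sizes reported in the Method section.
stimuli_cm = {
    "horizontal line": 18.9,
    "tilted line": 20.2,
    "single dash": 0.9,
}

for label, length_cm in stimuli_cm.items():
    angle = visual_angle_deg(length_cm, VIEWING_DISTANCE_CM)
    print(f"{label}: {angle:.2f} deg")
```

Under these assumptions the longer (tilted) line subtends slightly more than 14°, in line with the display extent stated above.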

For  each  trial  two  of  the  dashes  were  replaced  with  the  targets,  a  dot  and  a  gap.  The  dot  was  placed 
at  the  centre  of  the  dash  it  replaced  and  the  gap  element  was  made  by  eliminating  one  of  the  dashes.  The 
targets  (dot  or  gap)  were  located  toward  the  periphery  of  the  display  as  the  third  or  fourth  dash  from  the  end 
of  each  line.  Within  each  trial  one  target  was  placed  as  the  third  element  from  one  end  and  the  other  target 
was  then  placed  as  the  fourth  element  from  the  other  end.  This  was  to  avoid  any  unintended  symmetries  or 
subjective  contours  that  might  arise  between  the  target  elements  in  some  conditions.  Targets  were  equally 
likely  to  appear  as  the  third  or  fourth  element  on  the  horizontal  or  tilted  line  and  the  two  targets  were 
equally  likely  to  appear  in  the  horizontal  and  the  tilted  line.  On  half  of  the  trials  the  targets  were  the  same 
and on the other half different. When the targets were the same they were equally likely to be two dots or
two gaps. Finally, the targets were equally likely to occur within any of the three conditions (near, object,
far).  The  distance  between  the  target  elements  (measured  between  their  centres  for  convenience)  was  1.4° 
or  1.7°  in  the  near  condition,  depending  on  the  precise  location  of  the  targets  in  the  display,  8.8°  in  the 
object  condition  and  8.4°  in  the  far  condition.  A  total  of  96  different  displays  were  constructed  to  account 
for  all  the  possible  variations. 

Procedure.  Subjects  viewed  the  screen  in  a  darkened  room  with  their  head  stabilized  in  a  chinrest. 
Each  trial  began  with  a  fixation  point  displayed  at  the  centre  of  the  screen  for  1  second.  Following  the 
fixation point, the lines appeared simultaneously for 177 ms; the brief presentation of the lines prevented
the subjects from physically scanning the display. Subjects were required to make a speeded judgement of
whether the targets were the same (two dots or two gaps) or different (a dot and a gap) using the numeric
keypad on the keyboard. Subjects pressed “0” for same and “2” for different. Error feedback was given
immediately by the presentation of a 500 ms computer tone.

Subjects were explicitly instructed that the targets would appear towards the periphery of the
display  in  any  possible  combination.  The  displays  were  presented  in  random  intermixed  order  in  blocks  of 
64  trials.  After  each  block  the  subject  was  given  the  opportunity  to  rest  before  starting  the  next  block. 
There  were  11  blocks,  1  practice  plus  10  experimental  for  a  total  of  640  experimental  trials. 


Results 

Mean reaction times on correct trials for the six experimental conditions are given
in Figure 1.¹ It is clear from the figure that the object condition (for both same and
different responses) had the lowest latencies. The latencies were analyzed with a 2 (target:
same vs. different) x 3 (condition: near, object, far) ANOVA with repeated measures on
all factors. There was a significant main effect of condition, F(2,12) = 6.295, MSE =
1594.139, p < 0.05. A comparison of the object and far conditions using a 2 (target: same
vs. different) x 2 (condition: object vs. far) ANOVA also showed a significant effect of
condition, F(1,6) = 1.112, MSE = 1642.985, p < 0.05, as did a 2 x 2 ANOVA comparing
the object and near conditions, F(1,6) = 6.441, MSE = 2644.173, p < 0.05.
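To illustrate how an F ratio and its degrees of freedom arise in a repeated-measures design of this kind, the following is a minimal sketch of a one-way repeated-measures ANOVA on the condition factor. It is a simplification: the analysis reported above was a two-way (target x condition) ANOVA, whereas this sketch collapses across target. The latencies in hypothetical_rts are invented for illustration only and are not the data of Experiment 1; the function name rm_anova_oneway is likewise an assumption.

```python
def rm_anova_oneway(scores):
    """One-way repeated-measures ANOVA.

    scores: list of rows, one per subject; each row holds that subject's
    mean latency in each condition (same condition order in every row).
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of conditions
    grand = sum(sum(row) for row in scores) / (n * k)

    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_cond - ss_subj  # condition x subject residual

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    mse = ss_error / df_error
    return {"F": (ss_cond / df_cond) / mse, "df": (df_cond, df_error), "MSE": mse}


# HYPOTHETICAL latencies in ms (near, object, far) for 7 subjects.
hypothetical_rts = [
    [612, 575, 598],
    [655, 630, 661],
    [590, 548, 571],
    [701, 664, 690],
    [633, 611, 640],
    [580, 566, 579],
    [668, 622, 651],
]

result = rm_anova_oneway(hypothetical_rts)
print(result["df"], round(result["F"], 3), round(result["MSE"], 3))
```

With 7 subjects and 3 conditions the sketch yields degrees of freedom of (2, 12), matching the form of the main-effect test reported above.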

Discussion 

In  sum,  subjects  were  fastest  to  respond  in  the  object  condition.  This  is  consistent 
with  the  object-based  hypothesis  that  attention  is  assigned  to  features  that  are  grouped 
into  objects  based  on  Gestalt  grouping  principles.  One  implication  of  this  result  for  HUD 
design  is  that  attention  will  be  assigned  to  elements  in  a  symbology  set  that  form  coherent 
objects or perceptual groups. This could have important design
implications, both for the form of individual symbols and for the use of grouped symbols
in HUDs.


¹ Throughout the experiments reported in this section, the primary analyses focused on latencies. Error
rates were generally uninformative and are thus not discussed.

Figure 1. Mean reaction times by condition for Experiment 1. NS: near same; ND: near different; OS:
object same; OD: object different; FS: far same; FD: far different.


EXPERIMENT  2 
Object  Effects  and  Spatial  Cueing 

An important aspect of the overall safety for pilots using HUDs is the presence of
clear  and  effective  warning  signals.  Research  has  shown  that  the  sudden  onset  of  a 
stimulus  serves  as  an  effective  exogenous  cue  for  drawing  attention  to  a  particular 
location  (Posner,  1980;  Stelmach  &  Herdman,  1991;  Wright  &  Ward,  1998).  In  HUDs, 
such  cues  may  be  instantiated  as  transient  or  flashing  symbols. 

When dealing with a multitude of information in both the near and far domains,
a warning signal has to capture pilots’ attention effectively, drawing it away from any other sources
of information. As the problem of cognitive tunnelling demonstrates, the presence of
perceptual groups and coherent objects in the symbology might capture attention to the
point of hindering pilots’ detection of, and responses to, the warning signals. It is also
possible  that  effective  warning  cues  might  disrupt  the  processing  of  perceptual  groups  in 
the  symbology  in  unforeseen  ways. 

The purpose of Experiment 2 was to examine the interaction between object-based
attention and exogenous cueing. This was done by adding an exogenous spatial cue
to the object-based attention paradigm used in Experiment 1.


Method 

Subjects.  A  total  of  24  undergraduate  students  participated  in  this  study.  The  students  received 
partial  course  credit  for  participating  in  this  study. 

Apparatus  and  Stimuli.  The  apparatus  was  the  same  as  in  Experiment  1.  The  stimuli  were  similar 
to  those  used  in  Experiment  1,  with  the  following  exceptions.  Instead  of  using  red  and  green  as  line  colours, 
one  line  was  presented  in  pink  and  the  other  in  yellow.  The  two  lines  (horizontal  and  tilted)  were  equally 
likely to be yellow or pink. The targets were a white dot or a white dash (same length as the non-target
dashes).²

Procedure. Stimuli (the dashed lines and the targets) were presented for 130 ms rather than for the
177 ms used in Experiment 1. The lines were preceded by spatial cues that appeared for 66 ms on either the
left or the right side of the display (ISI of 0 ms). The spatial cues consisted of the endmost dash of both
lines. The early onset of the spatial cues was presumed to draw subjects’ attention to the cued side of the
display. The interval between the onset of the cue and the offset of the display was brief (66 + 130 = 196 ms),
thereby preventing overt attentional or ocular shifts.

The  cue  was  valid  on  70%  of  the  trials.  That  is,  on  70%  of  the  total  trials  the  cue  was  followed  by 
the  occurrence  of  the  targets  in  the  near  condition  (close  together  on  separate  lines)  on  the  side  of  the  cue 
(valid-near).  The  remaining  trials  were  equally  likely  to  occur  within  the  near  condition  on  the  side 
opposite  to  the  cue  (invalid-near)  or  within  the  object  condition  (invalid-object)  or  far  condition  (invalid- 
far). 

As  in  Experiment  1,  the  targets  were  the  same  or  different  equally  often  for  each  of  the  conditions. 
Each  block  consisted  of  80  trials,  8  displays  from  each  of  the  invalid  conditions  and  56  displays  from  the 
valid  near  condition  on  average.  The  cue  was  equally  likely  to  appear  on  left  or  right  side.  Subjects  started 
with a short block of 12 demonstration trials followed by 11 blocks of 80 trials each. The subjects were
explicitly  instructed  to  pay  close  attention  to  the  cue  since  the  targets  would  appear  most  often  on  the  cued 
side. 
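The composition of a block can be sketched as follows. This is a minimal illustration, assuming fixed counts of 56 valid-near trials and 8 trials per invalid condition (the text states that the valid count held on average, so the original software may have sampled conditions probabilistically). The names BLOCK_COMPOSITION and build_block are hypothetical, as is the way target type and cue side are balanced within each condition.

```python
import random
from collections import Counter

# Trials per 80-trial block, as described in the Procedure.
BLOCK_COMPOSITION = {
    "near-valid": 56,
    "near-invalid": 8,
    "object-invalid": 8,
    "far-invalid": 8,
}


def build_block(rng: random.Random) -> list:
    """Return one shuffled block with target type and cue side balanced per condition."""
    trials = []
    for condition, count in BLOCK_COMPOSITION.items():
        for i in range(count):
            trials.append({
                "condition": condition,
                # Alternate same/different so each occurs equally often.
                "target": "same" if i % 2 == 0 else "different",
                # Switch side every two trials so each side gets both target types.
                "cue_side": "left" if (i // 2) % 2 == 0 else "right",
            })
    rng.shuffle(trials)  # random intermixed order within the block
    return trials


block = build_block(random.Random(0))
counts = Counter(trial["condition"] for trial in block)
cue_validity = counts["near-valid"] / len(block)
print(dict(counts), cue_validity)
```

With these counts, 56 of 80 trials are valid, which gives the 70% cue validity stated above.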


² The use of these colours was prompted by concerns over the possible effects of line colour on target
discrimination.


Results

Subjects’ mean reaction times for correct trials were computed for the eight
experimental conditions (factorial combination of the four target location conditions and
the two target conditions, same vs. different) and are shown in Figure 2. The latencies were analyzed
with a 4 (condition: near-valid, near-invalid, object-invalid, far-invalid) x 2 (target: same
vs. different) repeated-measures ANOVA. The analysis showed that the main effect of
condition was significant, F(3,69) = 9.758, MSE = 3543.196, p < .001. A comparison of
the near-valid and near-invalid conditions with a 2 (condition) x 2 (target) ANOVA also
showed a significant effect of condition, F(1,23) = 20.324, MSE = 3189.147, p < .001, as
did comparisons of the near-valid and object-invalid conditions, F(1,23) = 11.394, MSE
= 2775.934, p < .005, the near-invalid and far-invalid conditions, F(1,23) = 21.139, MSE
= 4200.791, p < .001, and the object-invalid and far-invalid conditions, F(1,23) =
5.566, MSE = 2593.916, p < .05. Taken together, these results show that latencies in
the near-valid condition were significantly lower than in the near-invalid, object-invalid
and far-invalid conditions, and that the object-invalid trials were reliably faster than the
far-invalid trials.


Figure 2. Mean reaction times for Experiment 2. NSV: near-valid, same; NDV: near-valid, different; NSI:
near-invalid, same; NDI: near-invalid, different; OSI: object-invalid, same; ODI: object-invalid, different;
FSI: far-invalid, same; FDI: far-invalid, different.

In sum, the main result of this experiment is that the cueing of one side of the
display had a significant impact on the pattern of reaction times as a function of condition
that was found in Experiment 1. Unlike the result of Experiment 1, which showed
faster responses in the object condition, the lowest latencies in this experiment were
found for the near-valid conditions. The slowest responses were found in the far-invalid
conditions. These results show that spatial cueing eliminated the advantage for the object
condition that was observed in Experiment 1.

Discussion 

This  experiment  suggests  that  attentional  capture  by  the  use  of  spatial  cueing  can 
have  potentially  disruptive  effects  on  the  mechanisms  involved  in  object-based  attention. 
The  spatial  cues  clearly  facilitated  responses  to  both  types  of  near  targets  (valid  and 
invalid),  to  the  extent  that  the  advantage  displayed  for  the  object  condition  found  in 
Experiment  1  was  eliminated.  Another  significant  outcome  of  this  experiment  is  that  the 
perceptual  grouping  of  the  dashes  into  longer  lines,  presumably  the  basis  for  the  object 
effects  observed  in  Experiment  1,  did  not  prevent  the  cueing  from  being  effective.  This 
suggests  that  the  grouping  principles  which  possibly  account  for  object-based  attention 
do  not  interfere  with  spatial  cueing  to  a  large  degree,  at  least  not  when  the  cues  and  the 
perceptual groups overlap spatially (note, however, that the experiments on cognitive
tunnelling suggest otherwise for cues and objects that do not directly overlap, be they in
the near or far domains; see Fischer, Haines & Price, 1980; McCann & Foyle, 1996;
Wickens & Long, 1994).

What is the basis for this effect of cueing on object-based attention? In an
experiment using the same paradigm, Lavie and Driver (1996) found the following relative
ranking of latencies by condition: near-valid, near-invalid, object-invalid,
far-invalid. On its own, this ranking is suggestive of the spatial effects typically associated
with spatial cueing (see Posner, 1980). Spatial cueing might momentarily disrupt
mechanisms of object-based attention by changing the size of subjects’ attentional
“spotlight,” in line with the zoom-lens account of Eriksen and colleagues (Eriksen &
Eriksen, 1974; Eriksen & St. James, 1986). The premise of this argument is presumably
that the mechanisms of object-based attention can only operate within a stable attentional
spotlight, thus implying some sort of hierarchical relationship between spatial and
object-based attentional operators.

However, the evidence for this account is far from conclusive. Examination of the
invalid trials in this experiment shows that the lowest latencies were found for the
object-invalid condition, followed in order by the near-invalid and far-invalid conditions. The
difference in latencies was not found to be significant, except in the case of the far-invalid
condition when compared to the object-invalid condition. If further research were
to confirm the findings of this experiment, then it could be argued that spatial cueing of a
region of a display facilitates responses to events in that region, without necessarily
disrupting the effectiveness of object-based attentional operators. Furthermore, the
findings related to cognitive tunnelling suggest that spatial cueing might have a differential
impact on the processing of perceptual groups depending on whether the cues and the
perceptual groups overlap in space. Further research is clearly needed on this important
issue.

The apparent contrast between the present results and those reported in
experiments on cognitive tunnelling also suggests that the degree to which a cue is
expected and/or predictable could be a factor in determining its effectiveness. In the
studies reported by Fischer, Haines and Price (1980) and Wickens and Long (1994), the
events that pilots failed to notice, or reacted to more slowly, were often (but
not always) unexpected and were generally presented on a limited number of trials. The
cues used in this experiment were fully expected by the subjects to occur at the onset of
each trial. Moreover, the cues were highly informative, in that they
predicted the location of targets with 70% accuracy, which provided even more incentive
to attend to them. Could subjects’ expectation that the cue was going to appear, and their
intention to seek out and use the cue to aid their performance, have been a factor in the
effectiveness of the cues? This is the question considered in Experiment 3.


EXPERIMENT 3
Explicitness  of  the  Spatial  Cueing 


Experiment 3 was conducted to further explore the impact of spatial cueing on
object-based attention. Two changes were made from Experiment 2. First, in
Experiment 2, spatial cues were presented for only 66 ms; research has shown, however,
that exogenous cueing is most effective at around 100 ms (Wright & Ward, 1998). In
this experiment, some subjects saw a display where a 99 ms cueing interval was created by
presenting cues for 66 ms followed by a 33 ms blanking interval before the onset of
stimuli. Second, in Experiment 2 subjects were explicitly instructed to use the spatial
cues. In the present experiment, subjects were given no explicit instructions to use the
cues and the cue validity was not made explicit; in fact, no mention at all was made of the
presence of the cues. The purpose of these changes was twofold: the omission of
information about the cues from the instructions to subjects was aimed at examining the
effects of subjects’ expectations about the cueing on cue effectiveness; the use of an
exogenous cue with proven effectiveness was meant to provide a baseline against which to
compare the effects of the expected cueing of Experiment 2 and the unexpected cueing of
this experiment.
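The timing of the two cueing conditions can be laid out as a simple timeline. The following is a minimal sketch, assuming that the stimulus display lasted 130 ms as in Experiment 2 (the procedure is described as otherwise identical); the function name cue_timeline and the dictionary keys are illustrative.

```python
def cue_timeline(blank_ms: int, cue_ms: int = 66, display_ms: int = 130) -> dict:
    """Timing of one trial, measured from cue onset (all values in ms)."""
    soa_ms = cue_ms + blank_ms  # cue onset to stimulus onset
    return {
        "cue_ms": cue_ms,
        "blank_ms": blank_ms,
        "soa_ms": soa_ms,
        "cue_onset_to_display_offset_ms": soa_ms + display_ms,
    }


short_cue = cue_timeline(blank_ms=0)   # cue as used in Experiment 2
long_cue = cue_timeline(blank_ms=33)   # cue followed by the 33 ms blanking interval

for label, timeline in [("short cue", short_cue), ("long cue", long_cue)]:
    print(label, timeline)
```

Under these assumptions the cue-to-stimulus interval is 66 ms in the short condition and 99 ms in the long condition, and the whole sequence from cue onset to display offset stays below 250 ms in both cases.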

Method 

Subjects. A group of 20 first-year undergraduate psychology students participated in this study.
The  students  received  partial  course  credit  for  their  participation.  None  of  the  students  had  participated  in 
the  previous  experiments. 

Apparatus, Stimuli and Procedure. The apparatus and stimuli were the same as in Experiment 2
with the difference noted earlier: one group of subjects (N = 11) saw displays with the same cues as were
used  in  the  previous  experiment,  whereas  the  other  group  (N  =  9)  was  shown  displays  using  the  longer 
cueing  interval.  The  procedure  was  the  same  as  for  Experiment  2,  with  the  difference  that  the  subjects  in 
both  groups  were  given  no  information  about  the  presence  or  usefulness  of  the  cues. 

Results 

The mean reaction times for each experimental condition (near-valid, near-invalid,
object-invalid, far-invalid) were computed for correct trials. Figure 3 shows
latencies as a function of cue type (short cue: panel a; long cue: panel b). These latencies
were analyzed with a repeated-measures 4 (condition) x 2 (target: same or different)
ANOVA, with type of cue as a between-subjects factor. The ANOVA showed a
significant main effect of condition, F(3,54) = 5.206, MSE = 5383.134, p < .005. A
comparison of the near-valid and near-invalid conditions with a 2 (condition) x 2 (target)
ANOVA with cue as a between-subjects factor also showed a significant main effect of
condition, F(1,18) = 6.898, MSE = 2974.819, p < .05. No other 2 x 2
comparisons between conditions revealed significant effects. The between-subjects effect
of cue type failed to reach significance in all analyses.

In sum, the main result of this experiment is that responses were fastest for the
near-valid and near-invalid conditions, with near-valid being significantly faster than
near-invalid. This is generally consistent with the findings of Experiment 2, and
with the findings of Lavie and Driver (1996). It should also be noted that the
between-subjects factor of cueing length was implicated in only one interaction. Thus, these
results show that the length of the cueing interval used in this experiment had little
impact on subject performance.

Discussion 

The results of this experiment suggest very strongly that subjects’ expectations
about the occurrence and usefulness of cueing had little impact on the results of
Experiment 2. In fact, many subjects denied seeing the cues at all, although the
overwhelming majority did report perceiving the lines as being drawn from a given side
of the display, an effect which was also reported in Experiment 2, but which did not
occur in Experiment 1 where no spatial cues were used. Clearly, the cues did have a
phenomenological impact on the perception of the display, seemingly generating an
apparent motion effect. At any rate, it would seem that the spatial cues can be effective in
capturing attention in a display of grouped elements without observers expecting them or
intending to use them, or even being phenomenologically aware of them as cues.

Figure 3a. Mean reaction times for Experiment 3, short cue condition (N = 9).

Figure 3b. Mean reaction times for Experiment 3, long cue condition (N = 11).

The similarity between the reaction times for the object-invalid condition and
both types of near condition, and the significant difference between the object-invalid and
far-invalid conditions, both observed in Experiment 2, were absent in this experiment. It
is unclear whether this is due to the differences in cueing between Experiments 2 and 3,
as the number of subjects per cueing condition in Experiment 3 was too low to produce
reliable effects.


EXPERIMENT  4 

Is  Colour  Necessary  To  Form  An  Object? 

Experiment  1  showed  a  strong  effect  of  object-based  attention  with  objects  (lines) 
that  were  defined  by  two  Gestalt  grouping  principles:  good  continuation  and  colour.  It  is 
not  clear,  however,  whether  both  of  these  attributes  were  required  to  create  two  separate 
objects.  In  systems  such  as  NVGs,  the  HUD  symbology  is  monochromatic.  If  colour 
differentiation  is  critical  for  creating  perceptual  objects,  then  designing  HUD 
symbologies  meant  to  exploit  the  attentional  benefits  of  objects  and  perceptual  grouping 
may  be  problematic  in  cases  where  the  HUD  must  be  monochromatic  or  is  restricted  to  a 
very limited colour palette. Experiment 4 was designed to make some headway on this
issue by using the displays and procedures of Experiment 1, with the difference that the
stimuli of Experiment 4 were monochromatic. Thus, the objects (lines) were defined only
by the principle of good continuation.

Method 

Subjects.  A  total  of  10  first-year  undergraduate  psychology  students  participated  in  this  study. 
The  students  received  partial  course  credit  for  their  participation.  None  of  the  students  had  participated  in 
the  previous  experiments. 

Apparatus, Stimuli and Procedure. The apparatus, stimuli and procedure were the same as in
Experiment  1  with  the  exception  that  the  stimuli  (the  dashed  lines  and  the  target  consisting  of  a  dot)  were 
presented  in  black  on  a  light  grey  background.  The  second  target  was  a  gap  (a  missing  dash),  as  in 
Experiment  1. 


Results 

The mean latencies for the six experimental conditions on correct trials are
reported in Figure 4. Data from three subjects whose performance approached chance were
discarded. The means were analyzed with a 3 (condition: near, object, far) x 2 (target:
same vs. different) repeated-measures ANOVA. The analysis showed a significant effect of
condition, F(2,12) = 5.044, MSE = 560.577, p < .05.


Figure  4.  Mean  reaction  times  for  Experiment  4. 


A comparison of the near and object conditions with a 2 (condition) x 2 (target) ANOVA
revealed a significant effect of condition, F(1,6) = 8.482, MSE = 660.520, p < .05. No
other effect achieved significance, although the effect of condition approached
significance in a comparison of the object and far conditions, F(1,6) = 4.038, MSE =
240.207, p < .1.

The  main  result  of  this  experiment  is  that  the  object  condition  produced  the  lowest 
mean  latencies  for  the  three  conditions.  This  replicates  the  results  of  Experiment  1  and  is 
consistent  with  the  object-based  hypothesis. 

Discussion


The results of this experiment show that colour is not a necessary feature for
perceptual grouping. This has important implications for HUDs in systems that offer
only monochromatic symbologies (e.g., NVGs): such systems should still support
the allocation of attention in an object-based manner, so the design of their
symbologies can also exploit the insights gained from research on object-based attention.


However,  this  is  not  to  say  that  colour  plays  no  role  in  perceptual  grouping.  A 
comparison  of  Figures  1  and  4  (colour  &  no  colour)  shows  that  the  effect  of  objectness 
was  smaller  in  Experiment  4  than  in  Experiment  1,  by  about  20  ms.  Furthermore,  there  is 
a  suggestive  trend  towards  trials  in  the  far  condition  being  faster  than  in  the  near 
condition.  This  could  conceivably  be  due  to  the  display  being  grouped  into  a  single,  large 
object  (i.e.  a  tilted  ‘X’)  in  the  far  condition,  while  failing  to  do  so  in  the  near  condition, 
perhaps  because  not  enough  of  the  display  is  being  attended  to  in  order  for  grouping  into 
a  single  object  to  occur  in  the  latter  case.  If  these  results  are  borne  out  by  further  research, 
it  could  suggest  a  role  for  colour  as  an  adjunctive  grouping  factor,  strengthening  certain 
groupings primarily determined by symbol geometry. These same results also suggest
that perceptual grouping might be a hierarchical process, wherein certain objects are
the  result  of  the  grouping  of  other,  simpler  objects.  If  this  is  true,  the  design  of  HUDs 
could  benefit  from  the  hierarchical  nesting  of  symbols  to  facilitate  the  integration  of 
information  within  the  symbology  and  to  deal  with  issues  such  as  visual  clutter  and 
attentional  capture. 


SUMMARY  OF  EXPERIMENTS  1  -  4 


Experiments 1-4 have been important in establishing an object-based attention
paradigm which is reliable and which allows for the systematic study of a variety of
factors which might interact with object-based attention. Spatial factors are one set of
factors that can be explored with this paradigm. Initial steps have been taken toward
examining the relation between object-based attention and spatial cueing (Experiments 2
& 3). This is particularly relevant to the development of a framework for designing and
testing warning signals for HUDs in a principled manner. Another set of factors that can
be manipulated in this paradigm comprises those which determine the grouping of elements
into coherent wholes or objects. One such factor that was investigated was colour
(Experiment  4).  The  results  of  this  study  suggest  that  colour  does  not  play  a  primary  role 
in  perceptual  grouping,  although  it  might  well  play  an  ancillary  role.  This  is  good  news 
for  the  design  of  HUDs  for  systems  such  as  NVGs  which  require  monochromatic 
symbology. 

Experiments  1-4  raise  a  number  of  questions.  One  such  question  has  to  do  with 
the  exact  nature  of  the  interaction  between  spatial  cueing  and  object-based  attention. 
Another concerns the role of hierarchical perceptual grouping in attention. Finally,
these experiments do not address issues that might arise from the use of symbology in
dynamic displays, including the use of symbology which is itself dynamic.

It  is  known  that  common  motion  is  a  very  powerful  grouping  factor,  as  has  been 
revealed  by  work  on  the  development  of  object  perception  (Spelke,  Gutheil  &  Van  de 
Walle,  1995)  and  further  supported  by  experimental  work  in  cognitive  neuroscience 
(Valdes-Sosa,  Cobo  &  Pinilla,  1998).  In  the  area  of  HUD  design,  recent  HUD  prototypes, 
such  as  the  Technical  Panel  Two  (TP2)  symbology  set,  segregate  groups  of  related 
symbols  by  the  use  of  common  motion.  Can  the  grouping  of  elements,  which  can  be 
perceived  and  attended  to  as  objects  in  their  own  right,  lead  to  attention  being  allocated  to 
the  group  as  if  it  were  an  object  itself?  If  so,  will  this  be  of  use  in  addressing  the 
problems  of  integrating  information  both  within  a  domain  and  across  domains? 

EXPERIMENT  5 

Object-Based  Attentional  Layers  in  a  Dynamic  Display 

In  aircraft,  stimuli  in  both  the  near  (HUD)  and  the  far  (external  scene)  domains 
move.  This  dynamic  visual  context  differs  from  the  static  displays  that  have  been  used 
by  researchers  to  examine  object-based  attention.  Accordingly,  the  purpose  of 
Experiment 5 was to translate the object-based attention paradigm used in Experiments
1-4 to a situation where elements on the display are moving. Experiment 5 was not
intended  to  fully  mimic  a  complex  HUD-equipped  aircraft  environment,  but  instead,  to 
take  a  logical  and  significant  step  forward. 

This experiment makes two significant contributions. First, it establishes a
paradigm that can be used to assess object-based attention in a dynamic context. This
paradigm will enable future research on object-based attention in HUDs. Second, this
experiment is the first to show
that  object-layers  are  formed  through  common  motion  and  that  these  object-layers  can  be 
attended. 

In  the  current  paradigm,  subjects  were  presented  with  a  number  of  dots  on  a 
display.  A  subset  of  these  dots  moved  in  unison  (common  motion)  while  the  other  dots 
remained stationary. It is known that common fate (or common motion) is one of the
strongest, if not the strongest, of the perceptual grouping factors (Spelke et al., 1995). The moving dots
formed  what  is  termed  a  moving  layer.  The  stationary  dots  formed  a  static  layer.  The 
creation  of  these  two  layers  is  analogous  to  the  layering  that  may  occur  when  a  HUD 
(near  layer)  is  superimposed  on  the  external  scene  (far  layer).  One  obvious  difference  is 
that  in  an  aircraft  the  external  scene  is  also  moving,  but  presumably  not  in  common  fate 
with  the  movements  of  the  HUD  symbology. 

The creation of two layers in the current paradigm is also what occurs when frames
of reference (FOR) are mixed on HMDs. For example, the Technical Panel Two (TP2)
symbology set includes both head-referenced and aircraft-referenced symbology. The
head-referenced symbology is yoked to the pilot’s head movements, whereas the aircraft-referenced
symbology is yoked to the axis of the aircraft. Accordingly, when the pilot moves his/her
head, the head-referenced symbology moves with the head and independently of the
aircraft-referenced symbols.

The  object-based  attention  hypothesis  is  that  attention  can  be  allocated  to  the 
moving  versus  the  static  layers  of  dots.  This  should  result  in  faster  processing  of 
elements  within  a  layer  as  compared  to  across  layers  or  in  an  unattended  layer.  To  test 
this,  subjects  performed  a  same/different  task  where  two  of  the  dots  changed  to  either  the 
same or a different colour. The dots that changed were either from the same layer or from
different layers. Subjects performed a series of trials where they attempted to focus
attention on the moving layer and a series of trials where they focused attention on the
static layer. The object-based attention hypothesis was supported.

Method 

Observers.  Four  observers  participated  in  the  experiment.  Two  had  participated  in  Experiment  1. 

Apparatus. The display was presented on a ViewSonic 17 inch colour monitor with .25 dot pitch
(model PS790) controlled by a Cambridge Research Systems VSG 2/3 video board installed in a
Pentium-powered IBM compatible computer. Observer responses were collected using a response box equipped
with microswitches. The response box was connected to the VSG board. Response times were accurate to
less than 1 ms.

Stimuli.  The  display  consisted  of  seven  light  grey  dots  shown  against  a  dark  grey  background  at 
random  positions.  Four  of  the  dots  stayed  stationary  during  a  trial,  whereas  the  other  three  dots  moved  in 
unison. At some point in the trial two of the dots in the display changed colour, either to red or to green.
Each of these two dots was equally likely to be within the moving group or the stationary group. The
whole display subtended roughly 14° of visual angle; each dot had a diameter of 0.3°. The moving dots
had an elliptical trajectory: the trajectory’s parameters (direction of rotation, height and width) were varied
from  trial  to  trial. 
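
The common motion of the moving layer can be sketched as follows. This is a minimal illustration, not the original stimulus code: the function names, the per-frame sampling, the period in frames, and the convention that y increases upward are all assumptions introduced here. The only details taken from the text are that the moving dots share one elliptical trajectory and that its direction of rotation, height and width vary from trial to trial.

```python
import math


def ellipse_offsets(width, height, clockwise, n_frames, period_frames):
    """Return per-frame (dx, dy) offsets along an ellipse.

    Hypothetical helper: offsets start at (0, 0) so that every dot begins
    the trial at its randomly assigned position. Units are arbitrary
    (e.g. degrees of visual angle); y is assumed to increase upward.
    """
    sign = -1.0 if clockwise else 1.0
    offsets = []
    for frame in range(n_frames):
        theta = sign * 2.0 * math.pi * frame / period_frames
        # Subtract the starting point of the ellipse so frame 0 has no offset.
        dx = (width / 2.0) * (math.cos(theta) - 1.0)
        dy = (height / 2.0) * math.sin(theta)
        offsets.append((dx, dy))
    return offsets


def move_layer(start_positions, offsets):
    """Apply the same offset to every dot of the moving layer on each frame.

    Sharing one offset per frame is what makes the dots move in unison
    (common fate); the static layer is simply never passed to this function.
    """
    return [[(x + dx, y + dy) for (x, y) in start_positions]
            for (dx, dy) in offsets]
```

Because all dots in the moving layer receive an identical displacement on every frame, their relative positions never change, which is the property that defines grouping by common motion in this paradigm.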

Design.  There  were  12  conditions  in  the  experiment,  which  were  defined  by  the  2  x  3  x  2 
factorial combination of colour (same vs. different), location (moving layer, static layer, both layers) and
attentional  focus  (moving  layer  vs.  static  layer).  A  completely  repeated  measures  design  was  used  in  which 
all  combinations  of  conditions  were  experienced  by  each  observer. 

Colour.  There  were  two  colour  conditions;  both  target  dots  changing  to  the  same  colour  (i.e., 
either  red  or  green)  versus  each  target  dot  changing  to  a  different  colour. 

Location. There were three location conditions: both target dots occurred within the moving
layer of dots, both target dots occurred within the static layer of dots, or one target dot occurred within each layer.

Attentional  focus.  There  were  two  focus  conditions.  Observers  were  instructed  to  focus  attention 
on  the  moving  layer  of  dots  versus  on  the  static  layer  of  dots. 
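
The 2 x 3 x 2 factorial structure described above can be enumerated in a few lines. This is a sketch for illustration only: the level labels follow the text, but the dictionary layout and the helper `target_in_attended_layer` are assumptions introduced here, not part of the original experiment software.

```python
from itertools import product

# Factor levels as named in the Design paragraph.
COLOUR = ("same", "different")
LOCATION = ("moving layer", "static layer", "both layers")
FOCUS = ("moving layer", "static layer")


def design_cells():
    """Return all 12 cells of the colour x location x attentional focus design."""
    return [{"colour": c, "location": l, "focus": f}
            for c, l, f in product(COLOUR, LOCATION, FOCUS)]


def target_in_attended_layer(cell):
    """True when both targets fall in the layer the observer was told to attend."""
    return cell["location"] == cell["focus"]
```

Of the 12 cells, four place both targets in the attended layer; the object-based attention hypothesis predicts the shortest latencies in those cells.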

Procedure. A two-alternative forced-choice (2AFC) procedure was used. On each trial, the dots
were displayed for a variable interval lasting between two and six seconds, during which time all dots
remained light grey and the moving group of dots described their elliptical trajectory. At the end of the
variable interval, two randomly selected dots in the display changed colour, while the moving dots
continued their trajectory. The display with the coloured dots lasted until the observer responded, or until
the display timed out after three seconds. The observers’ task was to determine whether the two dots that
had just changed colour had taken on the same or a different colour (possible colours were red and green).
Observers responded by pressing one button on the response box for a “same” judgement, and another
button for a “different” judgement.

Each observer completed 6 blocks of 60 trials (total of 360 trials). Each block was initiated by the
observer, whereas each trial was initiated automatically following a brief (2 sec.) pause. For each block,
the observer was instructed to focus attention on either the moving layer of dots or on the static layer of
dots. The order of attentional focus was counterbalanced across observers. Observers were to respond
as quickly and as accurately as possible to the target (colour) dots.

Results 

Mean  reaction  times  on  correct  trials  were  computed  for  the  12  experimental 
conditions.  Figure  5  shows  latencies  as  a  function  of  attention  focus  to  the  moving  layer 
(panel  a)  versus  the  static  layer  (panel  b).  These  latencies  were  analyzed  with  a  2  (colour: 
same vs. different) x 3 (layer: moving layer, static layer, both layers) x 2 (attentional
focus: moving layer vs. static layer) ANOVA with repeated measures on all factors. The
ANOVA showed significant two-way interactions between layer and focus, F(2,6) =
61.985, MSE = 694.700, p < .001, and between layer and response, F(2,6) = 7.716, MSE = 80.136, p <
.05, and a three-way interaction between layer, response and focus, F(2,6) = 79.070, MSE
= 182.740, p < .001.
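
The first step of this analysis, computing mean reaction times on correct trials for each of the 12 conditions, might look like the sketch below. The trial record format (dictionaries with `colour`, `location`, `focus`, `correct` and `rt_ms` keys) is an assumption made for illustration; the report does not describe how the trial data were stored.

```python
from collections import defaultdict
from statistics import mean


def mean_correct_rt(trials):
    """Return mean RT (ms) per (colour, location, focus) cell, correct trials only.

    `trials` is assumed to be an iterable of dicts with the hypothetical keys
    'colour', 'location', 'focus', 'correct' (bool) and 'rt_ms' (number).
    Cells with no correct trials are absent from the result.
    """
    by_cell = defaultdict(list)
    for trial in trials:
        if trial["correct"]:
            cell = (trial["colour"], trial["location"], trial["focus"])
            by_cell[cell].append(trial["rt_ms"])
    return {cell: mean(rts) for cell, rts in by_cell.items()}
```

Applied per observer, the resulting cell means would be the values entered into a repeated-measures ANOVA of the kind reported here.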

Of  particular  interest  was  the  significant  interaction  between  location  and  focus. 
As shown in Figure 5, response latencies to targets were markedly lower when the targets
occurred  in  the  attended  layer.  When  attending  to  the  moving  layer  (panel  a)  latencies 
were  fastest  to  target  dots  that  occurred  in  the  moving  layer,  as  opposed  to  the  static  layer 
or  across  (both)  layers.  In  contrast,  when  attending  to  the  static  layer  (panel  b),  latencies 
were  faster  to  target  dots  that  occurred  in  the  static  layer  as  opposed  to  dots  occurring  in 
the  moving  layer  or  across  layers.  This  pattern  was  further  supported  in  a  separate 
ANOVA  comparing  latencies  in  the  moving  versus  the  static  conditions  which  showed  a 
significant interaction between layer and focus, F(1,3) = 71.498, MSE = 483.932, p <
.005.

In sum, the present results show that observers were able to use common motion
to partition the complete set of dots into moving versus static object layers and that
attention could be strategically allocated to one or the other layer. The size of this
object-based attention effect was substantial (80 ms on average) and appeared to be
symmetrical. The mean latencies show that the advantage of the attended layer is
reflected mainly in the condition where both of the target dots have the same colour.
Responses to dots of the same colour were on average 100 ms faster than to dots of
different colour in the attended layer. This same-colour advantage is reversed when the
target dots occur in the unattended layer. In this case, observers respond to dots of the
same colour more slowly, on average by about 30 ms. The meaning of this reversal is
unclear.

Figure 5a. Mean reaction times by condition of target for Experiment 5; focus on static layer. B_S_ST &
B_D_ST: both layers; ST_S_ST & ST_D_ST: static layer; MV_S_ST & MV_D_ST: moving layer.

Figure 5b. Mean reaction times by condition for Experiment 5; focus on moving layer. B_S_MV &
B_D_MV: both layers; ST_S_MV & ST_D_MV: static layer; MV_S_MV & MV_D_MV: moving layer.

Discussion 

The  present  results  are  consistent  with  the  object-based  attention  hypothesis  that 
elements  within  a  single  perceptual  object  will  be  processed  more  quickly  than  elements 
from  more  than  one  object.  An  important  contribution  of  this  research  is  in  showing  (a) 
that  elements  in  multi-element  displays  can  be  partitioned  into  object  layers  through  the 
use  of  common  motion  and  (b)  that  attention  can  be  effectively  allocated  to  object  layers 
in  dynamic  displays. 

Several  issues  concerning  the  impact  of  object  layering  remain  to  be 
systematically  examined.  For  one,  it  is  not  clear  from  the  present  experiment  whether  the 
object-layering  effect  is  facilitating  or  inhibitory.  Clearly,  grouping  elements  by  common 
motion  favours  the  processing  of  elements  within  the  group,  but  does  this  happen  at  the 
expense  of  an  observer’s  ability  to  attend  to  the  other  parts  of  the  display?  To  this  end,  a 
series of baseline conditions needs to be developed to determine whether attending to an
object layer speeds responses to targets in that layer, or whether attending to one layer
functions to slow responses to targets that occur outside the attended layer. This could
have  a  major  impact  on  the  processing  of  warning  signals  or  other  unexpected  events, 
which  might  happen  outside  of  the  layer  being  attended  to,  as  is  observed  with  cognitive 
tunnelling. 
SECTION THREE


1.  FINAL  DISCUSSION  AND  SUMMARY 

Traditional  aircraft  instrumentation  set-ups,  which  force  pilots  to  physically  scan 
between  their  flight  instruments  and  the  external  scene,  have  been  associated  with 
performance  decrements  in  certain  situations.  To  address  this  issue,  heads-up  displays 
(HUDs)  have  been  developed  which  project  symbolic  representations  of  the  flight 
instruments over the pilot’s forward field of view. Although HUDs have shown some
promise  in  improving  flight  performance,  they  have  also  been  associated  with  a  number 
of  attentional  problems.  These  problems  are  generally  associated  with  difficulties  in 
dividing  attention  between  the  HUD  symbology  (the  near  domain)  and  the  external  scene 
(the  far  domain).  It  has  been  shown  that  pilots  are  more  likely  to  miss  unexpected  events 
in  the  external  scene  while  attending  to  the  HUD,  a  phenomenon  which  has  been 
attributed  to  cognitive  tunnelling  on  the  HUD  symbology  (Martin-Emerson  &  Wickens, 
1997).  Also,  the  sheer  number  of  elements  in  the  HUD  symbology  clutters  the  field  of 
view  and  thereby  makes  it  difficult  for  pilots  to  (a)  use  the  information  from  the 
symbology  in  an  efficient  and  integrated  fashion  and  (b)  look  beyond  the  symbology  to 
the  external  scene. 

An  object-based  attention  framework  has  been  used  to  account  for  some  of  the 
attentional  problems  associated  with  HUDs.  On  this  view,  the  HUD  is  assumed  to  form 
one  perceptual  group  or  object,  which  captures  attention  at  the  expense  of  attending  to  the 
external  scene.  In  Experiment  1  of  the  present  research,  an  object-based  attention 
framework  was  established  which  allows  for  a  systematic  study  of  the  factors  which  can 
interact  with,  and  are  often  confounded  with,  object-based  factors  in  the  allocation  of 
visual  attention.  This  experiment  showed  that  subjects’  ability  to  process  targets  was 
determined  by  object-based  factors  rather  than  spatial  factors.  This  finding  is  important 
because  spatial  and  object-based  attentional  operators  have  often  been  confounded  in 
research  on  HUDs  (e.g.  see  McCann,  Foyle,  &  Johnston,  1993).  Experiment  2  examined 
the  interaction  of  object-based  attention  and  spatial  cueing.  The  results  showed  that 
spatial cueing is able to capture attention at the expense of allocating attention to
perceptual  objects  in  the  display.  Experiment  3  investigated  how  the  effectiveness  of 
spatial  cueing  might  be  influenced  by  the  cognitive  salience  of  cueing.  Subjects’ 
attention  was  captured  just  as  effectively  whether  or  not  they  had  been  informed  about 
the  existence  of  the  spatial  cues  and  their  relevance  to  the  task.  These  results  have 
implications  for  the  use  of  warning  signals  and  demonstrate  that  spatial  and  object-based 
factors  can  interact  in  unexpected  ways  in  HUDs. 

Experiments  4  and  5  examined  the  role  of  individual  grouping  principles  in  the 
allocation  of  attention  to  objects.  Experiment  4  showed  that  subjects  were  able  to 
allocate  attention  to  individual  perceptual  objects  regardless  of  whether  they  were 
distinguished  by  different  colours.  This  is  an  important  finding  for  HUD  systems  which 
must  make  use  of  a  restricted  range  of  colours;  this  is  particularly  true  of  NVGs,  where 
both  the  symbology  and  the  external  scenery  are  displayed  using  the  same  colour.  In 
Experiment  5,  a  step  was  taken  toward  the  study  of  object-based  attention  in  dynamic 
displays  which  more  closely  resemble  actual  HUD  environments.  Experiment  5  showed 
that  subjects  are  able  to  attend  to  groups  of  objects  that  are  segregated  by  common 
motion.  This  suggests  that  common  motion  might  be  a  useful  factor  in  improving  the 
organization  of  HUD  symbology. 

The  Experiment  5  finding  that  attention  can  be  allocated  to  objects  (or  object 
layers)  that  are  defined  by  common  motion  has  direct  implications  for  the  Technical 
Panel  2  (TP2)  symbology  set.  The  TP2  set  was  developed  for  use  in  a  head-tracked 
HMD. The TP2 set is configured as a mixed frame of reference where some of the HUD
symbology  is  referenced  to  the  front  and  centre  of  the  aircraft,  whereas  other  symbology 
is  referenced  to  the  head  movements  of  the  pilot.  According  to  Gestalt  principles,  the 
head-referenced  (moving)  symbology  will  be  perceptually  grouped  in  an  object  separate 
from  the  aircraft-referenced  symbology.  Importantly,  and  in  accord  with  the  object-based 
hypothesis,  the  results  of  Experiment  5  suggest  that  attention  may  be  differentially 
allocated  to  head-referenced  versus  aircraft-referenced  symbology.  One  potential 
outcome  is  that  processing  of  head-referenced  symbology  may  benefit  from  object-based 
attention.  On  the  other  hand,  a  potential  disadvantage  is  that  it  may  be  harder  for  pilots  to 
switch from processing the head-referenced symbology to processing the aircraft-referenced
symbology. These possibilities need to be systematically explored.

2.  FUTURE  DIRECTIONS 

To  date,  the  current  research  program  has  established  a  viable  framework  for 
studying  object-based  attention  and  its  relation  to  other  factors.  In  addition,  necessary 
preliminary  steps  have  been  taken  toward  the  investigation  of  attentional  control  in  HUD 
environments.  This  has  involved  the  study  of  individual  grouping  principles  and  the  use 
of  dynamic  displays. 

Two parallel streams. The continuation of this line of research will involve work
in  two  parallel  and  mutually  supporting  streams.  One  stream  will  involve  the  continued 
study  of  dynamic,  more  realistic  displays  in  an  experimental  setting:  this  stream  will 
include  further  research  into  the  development  of  an  object-based  attention  framework. 
The  second  stream  will  involve  the  application  of  the  object-based  framework  to 
investigate  HUD  symbology  in  simulated  flight.  A  concrete  instance  of  this  parallel 
stream  approach  has  already  been  illustrated  in  the  discussion  of  Experiment  5  and  the 
experiments  on  the  TP2  set.  The  TP2  experiments  examined  the  impact  of  grouping 
elements  of  the  symbology  by  common  fate  in  a  setting  approximating  actual  flight 
conditions.

An  important  aspect  of  the  TP2  symbology  set  is  that  the  moving  HUD 
symbologies  are  head-referenced.  This  raises  the  possibility  that  grouping  by  common 
motion  might  be  further  strengthened  by  the  user  controlling  the  motion  of  the  elements. 
To  this  end,  laboratory  research  is  required  to  systematically  isolate  and  assess  the  role  of 
user-controlled  motion  in  attentional  control. 

Another  issue  associated  with  the  TP2  mixed-referencing  configuration  is  to 
determine  the  conditions  under  which  the  head-referenced  elements  are  useful  versus  the 
conditions  where  head-referencing  might  be  detrimental  to  performance.  In  particular,  it 
is  unclear  whether  the  facilitating  effect  of  allocating  attention  to  the  head-referenced 
symbology  inhibits  the  allocation  of  attention  to  other  elements  in  the  display,  and  to 
what  extent  this  inhibition  might  be  detrimental.  This  is,  of  course,  a  question  about  the 
fundamental nature of object-based attention. The effective visual processing of a scene,
such  as  a  HUD  environment,  requires  the  integration  of  information  from  many  different 
elements  in  many  different  parts  of  the  scene.  It  is  known  from  the  object-based  attention 
literature  that  the  integration  of  information  is  facilitated  by  the  perception  that  the 
sources  of  that  information  constitute  a  coherent  whole.  The  grouping  of  individual 
elements  in  a  display  into  wholes  is  based  on  a  number  of  grouping  principles,  such  as 
grouping  by  proximity,  grouping  by  colour,  good  continuation,  and  grouping  by  motion, 
to  name  a  few.  Thus,  these  principles  can  be  put  to  good  use  to  facilitate  the  integration 
of  information.  However,  not  all  grouping  principles  are  appropriate  to  all  situations.  For 
instance,  grouping  by  colour  cannot  be  used  in  NVG  displays,  as  these  are  monochrome. 
By  the  same  token,  grouping  certain  elements  of  a  symbology  by  common  motion  would 
be  counterproductive  if  some  of  the  elements  are  informative  only  in  a  fixed  frame  of 
reference. 

In  sum,  making  use  of  the  object-based  attention  paradigm  in  the  design  and 
evaluation  of  HUDs  requires  a  good  theoretical  understanding  of  the  nature  of  the 
perceptual  grouping  and  the  effects  of  its  various  mechanisms  and  their  interactions  on 
attention.  It  is  important  to  determine  which  grouping  principles  form  the  most  coherent 
groups  and  the  most  object-like  shapes.  It  is  also  important  to  determine  the  extent  to 
which the coherence of these forms of perceptual grouping (degree of "objectness")
is related to the capturing and maintenance of attention. To this end, it would be useful to
develop  metrics  of  the  coherence  of  perceptual  elements.  Such  metrics  will  require  an 
interdisciplinary  approach,  bringing  together  insights  and  methods  from  cognitive  and 
perceptual  psychology,  neuroscience,  artificial  intelligence  (in  the  form  of  computer 
vision)  and  philosophy. 

In  using  an  object-based  approach  to  developing  HUDs,  it  is  important  to  index 
the  hierarchy  of  grouping  principles,  where  one  perceptual  grouping  principle  overrides 
others.  Similarly,  it  is  necessary  to  know  whether  a  single  grouping  principle  is  sufficient 
for  forming  an  object  or  whether  more  than  one  grouping  principle  must  be  put  into 
place.  It  is  possible  that  the  addition  of  a  second  grouping  factor  to  a  HUD  symbol  will 
enhance  the  sense  of  the  symbol’s  objectness.  It  is  not  clear,  however,  whether  the  extent 
of a symbol’s “objectness” is related to the degree of object-based attention assigned to the
symbol. For instance, HUD symbologies that are too "strong" as objects may impact on
cognitive  tunnelling.  Further,  it  is  imperative  to  assess  the  relation  between  the  influence 
of  spatial  cueing  and  object-based  attention  in  dynamic  contexts. 

Summary.  In  sum,  the  research  summarized  in  this  report  demonstrates  that  an  object- 
based  attention  framework  is  a  viable  and  useful  approach  for  HUD  research  and 
development. The laboratory research that has been conducted complements and directly
impacts on the development of HUD and HMD systems such as those proposed by
the  TP2  panel.  The  further  development  of  HUD/HMD  systems  will  benefit  from  the 
continuation  of  this  parallel  laboratory  and  simulator  research  program. 